Preface



This volume contains papers selected for presentation during the 24th International
Symposium on Mathematical Foundations of Computer Science held on
September 6-10, 1999 in Szklarska Poręba, Poland. The symposium, organized
alternately in the Czech Republic, Slovakia, and Poland, focuses on theoretical
aspects and mathematical foundations of computer science.

The scientific program of the symposium consists of five invited talks given 
by Martin Dyer, Dexter Kozen, Giovanni Manzini, Sergio Rajsbaum, and Mads 
Tofte, and 37 accepted papers chosen out of 68 submissions. The volume contains 
all accepted contributed papers, and three invited papers. 

The contributed papers have been selected for presentation based on their
scientific quality, novelty, and interest for the general audience of MFCS
participants. Each paper has been reviewed by at least three independent referees
(PC members and/or sub-referees appointed by them). The papers were selected
for presentation during a fully electronic virtual meeting of the program
committee on May 7, 1999.

The virtual PC meeting was supported by software written by Artur Zgoda,
Ph.D. student at the University of Wrocław. All communication with, and
access to, the rather sensitive database at PC headquarters in Wrocław was secured
by cryptographic protocols based on certificate technology.

We would like to thank Artur Zgoda for his tremendous work in preparing 
the software support for this symposium. We also thank all PC members and 
the sub-referees for completing their demanding work during the short period 
between the deadline and the PC meeting. We would like to thank the organizing 
committee and the institutions that supported the meeting financially: Stefan 
Banach International Mathematical Center, Institute of Computer Science of the 
Polish Academy of Sciences, Adam Mickiewicz University in Poznań, University
of Wrocław, and Polish Ministry of Education. Detailed information on the
MFCS’99 meeting is available at URL http://www.tcs.uni.wroc.pl/mfcs99/.

Abstract. Let $\beta$ be a real number > 1. Addition and multiplication
by a fixed positive integer of real numbers represented in base $\beta$ are
shown to be computable by an on-line algorithm, and thus are continuous
functions. When $\beta$ is a Pisot number, these functions are computable by
an on-line finite automaton.



1 Introduction 

In computer arithmetic, on-line computation consists of performing arithmetic
operations in Most Significant Digit First (MSDF) mode, digit serially after a
certain latency delay [7]. This allows the pipelining of different operations such as
addition, multiplication and division. It is also appropriate for the processing of
real numbers having infinite expansions. It is well known that when multiplying
two real numbers, only the left part of the result is significant. To be able to
perform on-line addition, it is necessary to use a redundant number system (see
[15]).

We now take a different point of view. A function is computable by a finite
automaton if it needs only a finite auxiliary storage memory, independent of the
size of the data. In that setting, one knows that addition of two integers in the
classical $b$-ary system is computable by a finite automaton, but that squaring
is not (see [6]). Actually, the natural finite automaton one designs to perform
addition is a sequential one, processing numbers in the Least Significant Digit
First (LSDF) mode.

On-line finite automata have been introduced by Muller [12]. They are sequential
finite automata processing data in MSDF mode. In integral base $b$ on the
canonical digit set $\{0, \ldots, b-1\}$, addition is not on-line computable, but with a
balanced alphabet of signed digits of the form $\{-a, \ldots, a\}$ with $b/2 \le a \le b-1$,
using the algorithms of Avizienis [1] and Chow and Robertson [5], addition is
computable by an on-line finite automaton (see [12], [11]). In the same spirit 
we have shown that, in a complex base of the form $\sqrt{b}$, where $b$ is a (possibly
negative) integer such that $|b| \ge 2$, and with digit set $\{-a, \ldots, a\}$ with $|b|/2 \le a \le |b|-1$,
addition is computable by an on-line finite automaton as well [10].

In this paper we consider a base $\beta$ which is a real number > 1, generally
not an integer. For any $\beta > 1$, by the greedy algorithm of Rényi [14], one can
compute a representation in base $\beta$ of any real number from the interval [0, 1],
called its $\beta$-expansion, and where the digits are elements of the canonical digit
set $A = \{0, \ldots, \lfloor\beta\rfloor\}$. In such a representation, not all the patterns of digits
are allowed (see [13] for instance). Recall that a Pisot number (also called a
Pisot-Vijayaraghavan number) is an algebraic integer such that all its algebraic
conjugates have modulus less than 1. The natural integers and the golden ratio
are Pisot numbers. In a previous work, we have shown that addition in real
base $\beta$ is computable by a finite automaton if and only if $\beta$ is a Pisot number
[2], but the automaton is not sequential, that is to say, data are not processed
deterministically in MSDF nor in LSDF mode. Note that the result given by this
automaton is the greedy $\beta$-expansion of the sum of the numbers. On the other
hand, it is known that it is not possible to find a sequential finite automaton
realizing addition and giving the result under the $\beta$-expansion form [9].

We will say that the addition is nonnormalized if the result is written on the
canonical digit set $A$, but need not be the greedy $\beta$-expansion of the sum of the
numbers.

For some Pisot bases of a special kind, nonnormalized addition was known
to be computable by an on-line finite automaton, namely for bases $\beta > 1$ where
$\beta$ is the dominant root of an equation of the form

$$X^m - aX^{m-1} - aX^{m-2} - \cdots - aX - b$$

where $a \ge b \ge 1$ are integers, and $m \ge 2$. The most well-known case is the
golden ratio $(1 + \sqrt{5})/2$, with $m = 2$, $a = b = 1$ (see [10]).

In this work we generalize this result to any Pisot number. 

The paper is organized as follows. First we show that for any real base
$\beta > 1$ and any set of nonnegative digits $D \supseteq A = \{0, \ldots, \lfloor\beta\rfloor\}$, the conversion
which transforms a $\beta$-representation with digits in $D$ into an equivalent
$\beta$-representation with digits in $A$ (but not necessarily the $\beta$-expansion) is
computable by an on-line algorithm (Theorem 1). We then show that such a function
is continuous for the product topology on the set $D^{\mathbb{N}}$, and the function induced
on real numbers is continuous as well.

When $\beta$ is a Pisot number, the conversion is computable by an on-line finite
automaton (Theorem 2). Note that this result applies to the case where $\beta$ is
an integer and the alphabet $A$ is equal to $\{0, \ldots, \beta\}$. The case $\beta = 2$ and
$A = \{0, 1, 2\}$ is the well-known “carry-save” representation used in computer
arithmetic.

Nonnormalized addition and multiplication by a fixed positive integer are
particular cases of digit set conversions.

In the case where $\beta$ is a Pisot number, one can define linear numeration systems
associated with $\beta$, like the Fibonacci numeration system associated with
the golden ratio. In these systems, any natural number has a greedy representation
by an algorithm of Fraenkel [8]. Then the digit set conversion in a linear
numeration system associated with a Pisot number is computable by an on-line
finite automaton.




On-Line Addition in Real Base 






2 Preliminaries

An alphabet $A$ is a finite set. A finite sequence of elements of $A$ is called a word,
and the set of words on $A$ is the free monoid $A^*$. The empty word is denoted by
$\varepsilon$. The set of infinite sequences or infinite words on $A$ is denoted by $A^{\mathbb{N}}$. Let $v$
be a word of $A^*$, denote by $v^n$ the concatenation of $v$ to itself $n$ times, and by
$v^\omega$ the infinite concatenation $vvv\cdots$.



2.1 Beta-Representations 

Let $\beta > 1$ be a real number. A $\beta$-representation of a number $x$ of [0, 1] is an
infinite sequence $(d_k)_{k \ge 1}$ such that $\sum_{k \ge 1} d_k\beta^{-k} = x$.

Any real number $x \in [0, 1]$ can be represented in base $\beta$ by the following
greedy algorithm [14]:

Denote by $\lfloor\cdot\rfloor$ and by $\{\cdot\}$ the integral part and the fractional part of a number.
Let $x_1 = \lfloor\beta x\rfloor$ and let $r_1 = \{\beta x\}$. Then iterate for $k \ge 2$, $x_k = \lfloor\beta r_{k-1}\rfloor$ and
$r_k = \{\beta r_{k-1}\}$.

Thus $x = \sum_{k \ge 1} x_k\beta^{-k}$, where the digits $x_k$ are elements of the canonical
alphabet $A = \{0, \ldots, \lfloor\beta\rfloor\}$ if $\beta \notin \mathbb{N}$, $A = \{0, \ldots, \beta - 1\}$ otherwise. The sequence
$(x_k)_{k \ge 1}$ of $A^{\mathbb{N}}$ is called the $\beta$-expansion of $x$. When $\beta$ is not an integer, a
number $x$ may have several different $\beta$-representations on $A$: this system is naturally
redundant. The $\beta$-expansion obtained by the greedy algorithm is the greatest
one in the lexicographic ordering. When a $\beta$-representation ends with infinitely
many zeroes, it is said to be finite.
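
The greedy algorithm above can be sketched in a few lines of Python. This is only an illustration: the function names `greedy_expansion` and `value` are not from the paper, only finitely many digits are produced, and with a floating-point base (such as the golden ratio) the digits are reliable only for a moderate number of steps, since rounding errors grow by a factor of about $\beta$ at each iteration. With `fractions.Fraction` inputs the computation is exact.

```python
from fractions import Fraction
from math import floor

def greedy_expansion(x, beta, n):
    """Return the first n digits x_1..x_n of the greedy expansion of x in base beta."""
    digits = []
    r = x
    for _ in range(n):
        y = beta * r
        digit = floor(y)        # x_k = floor(beta * r_{k-1})
        digits.append(digit)
        r = y - digit           # r_k = fractional part of beta * r_{k-1}
    return digits

def value(digits, beta):
    """Numerical value sum_k d_k * beta**(-k) of a finite digit sequence."""
    return sum(d * beta ** -(k + 1) for k, d in enumerate(digits))
```

Truncating after $n$ digits leaves an error of at most $\beta^{-n}$, which is what the check on `value` relies on.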

Let $D$ be a digit set. The numerical value in base $\beta$ on $D$ is the function
$\pi_\beta : D^{\mathbb{N}} \to \mathbb{R}$ such that $\pi_\beta((d_k)_{k \ge 1}) = \sum_{k \ge 1} d_k\beta^{-k}$. Given two digit sets $A$
and $D$, a digit set conversion in base $\beta$ from $D$ to $A$ is a function $\chi : D^{\mathbb{N}} \to A^{\mathbb{N}}$
such that for each sequence $(d_k)_{k \ge 1} \in D^{\mathbb{N}}$ where $x = \pi_\beta((d_k)_{k \ge 1})$ belongs to
[0, 1], there exists a sequence $(a_k)_{k \ge 1} \in A^{\mathbb{N}}$ such that $x = \pi_\beta((a_k)_{k \ge 1})$. Note
that a priori the result of a conversion is not unique, but all the processes we
shall consider later on are deterministic, and thus compute functions.

Remark that the image $\chi((d_k)_{k \ge 1})$ belongs to $A^{\mathbb{N}}$, but need not be the
$\beta$-expansion of $x$ as computed by the greedy algorithm.

To perform addition in base $\beta$ the process is the following one: take two
numbers $v = \sum_{k \ge 1} v_k\beta^{-k}$ and $y = \sum_{k \ge 1} y_k\beta^{-k}$ with $v_k$ and $y_k$ in $A$, such that
$v + y \in [0, 1]$. Set $z_k = v_k + y_k$. Then $z_k$ is an element of $B = \{0, \ldots, 2\lfloor\beta\rfloor\}$,
and $v + y = \sum_{k \ge 1} z_k\beta^{-k}$. Addition consists of transforming the representation
$(z_k)_{k \ge 1}$ of $v + y$ on $B$ into an equivalent one $(s_k)_{k \ge 1}$, such that
$v + y = \sum_{k \ge 1} s_k\beta^{-k}$ with $s_k \in A$. Multiplication by a fixed integer $m > 1$ is analogous:
multiply by $m$ each digit of the $\beta$-expansion. This gives a sequence on the alphabet
$\{0, \ldots, m\lfloor\beta\rfloor\}$, to be converted into an equivalent $\beta$-representation on $A$.
Nonnormalized addition and multiplication by a fixed positive integer are thus
special cases of digit set conversion.
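
The reduction of addition and multiplication to digit set conversion can be illustrated on finite digit words. The sketch below is an assumption-laden illustration (the names `digitwise_sum`, `digitwise_scale` and `value` are hypothetical): it only builds the intermediate word on the larger alphabet and leaves the conversion back to $A$ to a separate step.

```python
from fractions import Fraction

def value(digits, beta):
    """Numerical value sum_k d_k * beta**(-k) of a finite digit word."""
    return sum(d * beta ** -(k + 1) for k, d in enumerate(digits))

def digitwise_sum(v, y):
    """z_k = v_k + y_k: a representation of v + y on the alphabet {0, ..., 2*floor(beta)}."""
    return [a + b for a, b in zip(v, y)]

def digitwise_scale(v, m):
    """m * v_k: a representation of m*v on the alphabet {0, ..., m*floor(beta)}."""
    return [m * a for a in v]
```

In both cases the numerical value is preserved exactly; only the digit alphabet grows.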







Christiane Frougny 



2.2 On-Line Computability 

Let $X$ and $Y$ be two finite digit sets, and let $\varphi$ be a function from $X^{\mathbb{N}}$ to $Y^{\mathbb{N}}$ (for
simplicity we consider only one-variable functions, but it is not a restriction). $X$
is the input alphabet, and $Y$ is the output alphabet. Following [15] and [7], we
say that $\varphi$ is on-line computable with delay $\delta$ if there exists a natural number
$\delta$ such that, given $x = (x_k)_{k \ge 1}$, to compute $y = \varphi(x) = (y_k)_{k \ge 1}$, it is necessary
and sufficient to have the digits $x_1, \ldots, x_{k+\delta}$ available to generate $y_k$, for $k \ge 1$.
After the delay, one digit of the result is produced upon receiving one digit of $x$.
It is well known that some functions are not on-line computable, like addition
in the binary system with canonical digit set $\{0, 1\}$. Addition is considered as
a conversion $\chi$ from $\{0, 1, 2\}$ to $\{0, 1\}$. Since $\chi(01^n20^\omega) = 10^\omega$ and
$\chi(01^n0^\omega) = 01^n0^\omega$ for any $n \ge 1$, one sees that the least significant digits have to be known
to be able to output the most significant digit of the result.
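
The counterexample can be checked with exact arithmetic. The following sketch (helper names `value2` and `common_prefix_length` are illustrative only) confirms that the two inputs agree on their first $n + 1$ digits while their values lie on either side of the threshold that decides the first output digit.

```python
from fractions import Fraction

def value2(digits):
    """Exact value in base 2 of a finite digit word (trailing zeros omitted)."""
    return sum(Fraction(d, 2 ** (k + 1)) for k, d in enumerate(digits))

def common_prefix_length(n):
    """Number of leading digits shared by the inputs 0 1^n 2 and 0 1^n 0."""
    u = [0] + [1] * n + [2]
    w = [0] + [1] * n + [0]
    return next(i for i, (a, b) in enumerate(zip(u, w)) if a != b)
```

Since $n$ is arbitrary, no fixed delay is enough to decide the first output digit on the canonical alphabet $\{0, 1\}$.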

2.3 Automata 

We refer the reader to [6]. An automaton over $A$, $\mathcal{A} = (Q, A, E, I, T)$, is a
directed graph labelled by elements of $A$. The set of vertices, traditionally called
states, is denoted by $Q$, $I \subset Q$ is the set of initial states, $T \subset Q$ is the set of
terminal states and $E \subset Q \times A \times Q$ is the set of labelled edges. If $(p, a, q) \in E$,
we note $p \xrightarrow{a} q$. The automaton is finite if $Q$ is finite. The automaton $\mathcal{A}$ is
deterministic if $E$ is the graph of a (partial) function from $Q \times A$ into $Q$, and
if there is a unique initial state. A subset $H$ of $A^*$ is said to be recognizable by
a finite automaton if there exists a finite automaton $\mathcal{A}$ such that $H$ is equal to
the set of labels of paths starting in an initial state and ending in a terminal
state. A subset $K$ of $A^{\mathbb{N}}$ is said to be recognizable by a finite automaton if there
exists a finite automaton $\mathcal{A}$ such that $K$ is equal to the set of labels of infinite
paths starting in an initial state and going infinitely often through a terminal
state (Büchi acceptance condition, see [6]).

In this paper we are interested in 2-tape automata. Let $X$ and $Y$ be two
alphabets. A 2-tape automaton is an automaton over the non-free monoid $X^* \times Y^*$:
$\mathcal{A} = (Q, X^* \times Y^*, E, I, T)$ is a directed graph the edges of which are labelled
by elements of $X^* \times Y^*$. Words of $X^*$ are referred to as input words, and words
of $Y^*$ are referred to as output words. If $(p, (f, g), q) \in E$, we note $p \xrightarrow{f/g} q$. The
automaton is finite if $Q$ and $E$ are finite. The finite 2-tape automata are also
known as transducers. A relation $R$ of $X^* \times Y^*$ is said to be computable by a
finite 2-tape automaton if there exists a finite 2-tape automaton $\mathcal{A}$ such that $R$
is equal to the set of labels of paths starting in an initial state and ending in
a terminal state. A function is computable by a finite 2-tape automaton if its
graph is computable by a finite 2-tape automaton. These definitions extend to
relations and functions of infinite words as above.

A sequential finite automaton is a finite 2-tape automaton where edges are
labelled by elements of $X \times Y^*$, and such that the underlying input automaton
obtained by taking the projection over $X$ of the label of every edge is deterministic.
An on-line finite automaton with delay $\delta$, $\mathcal{A} = (Q, X \times (Y \cup \varepsilon), E, \{i_0\}, \omega)$,
is a sequential automaton composed of a transient part and of a synchronous
part (see [12], [11]). The set of states is equal to $Q = Q_t \cup Q_s$, where $Q_t$ is the
set of transient states and $Q_s$ is the set of synchronous states. In the transient
part, every path of length $\delta$ starting in the initial state $i_0$ is of the form

$$i_0 \xrightarrow{x_1/\varepsilon} i_1 \xrightarrow{x_2/\varepsilon} \cdots \xrightarrow{x_\delta/\varepsilon} i_\delta$$

where $i_0, \ldots, i_{\delta-1}$ are in $Q_t$, $x_j$ in $X$, for $1 \le j \le \delta$, and the only edge arriving in
a state of $Q_t$ is as above. In the synchronous part, edges are labelled by elements
of $X \times Y$. This means that the automaton starts reading words of length $\le \delta$
outputting nothing, and after that delay, outputs serially one digit for each input
digit.

For finite words, there is a terminal function $\omega : Q_s \to Y^*$, whose value
is concatenated to the output word corresponding to a computation in $\mathcal{A}$. The
same definition works for functions of infinite words, considering infinite paths
in $\mathcal{A}$, but there is no terminal function $\omega$ in that case.

3 On-Line Digit Set Conversion in Real Base 

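Let $A = \{0, \ldots, \lfloor\beta\rfloor\}$ be the canonical alphabet associated with $\beta$, and let
$D = \{0, \ldots, d\}$ be a digit set containing $A$, that is, $d \ge \lfloor\beta\rfloor$.

Theorem 1. There exists a digit set conversion $\chi : D^{\mathbb{N}} \to A^{\mathbb{N}}$ in base $\beta$ which
is on-line computable with delay $\delta$, where $\delta$ is the smallest positive integer such
that

$$\beta^{\delta+1} + d \le \beta^\delta(\lfloor\beta\rfloor + 1) \qquad (*)$$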

Proof. Clearly a number $\delta$ satisfying (*) exists. Let $k$ be the smallest nonnegative
integer such that $d/(\beta^k(\beta - 1)) \le 1$. In order to avoid overflow, all input words
are supposed to begin with a run of $k$ zeroes.

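On-line algorithm.

Input: a sequence $(d_j)_{j \ge 1} \in D^{\mathbb{N}}$ such that $d_1 = \cdots = d_k = 0$.

Output: a sequence $(a_j)_{j \ge 1} \in A^{\mathbb{N}}$ such that $\sum_{j \ge 1} a_j\beta^{-j} = \sum_{j \ge 1} d_j\beta^{-j}$.

begin
    $q_0 \leftarrow 0$
    for $i \leftarrow 1$ to $\delta$ do
        $q_i \leftarrow \beta q_{i-1} + d_i$
    $j \leftarrow 1$
    while $j \ge 1$ do
        $z_{\delta+j} \leftarrow \beta q_{\delta+j-1} + d_{\delta+j}$
        if $z_{\delta+j} < \beta^{\delta+1}$
            then $a_j \leftarrow \lfloor z_{\delta+j}/\beta^\delta \rfloor$
            else $a_j \leftarrow \lfloor\beta\rfloor$
        $q_{\delta+j} \leftarrow z_{\delta+j} - a_j\beta^\delta$
        $j \leftarrow j + 1$
end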
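
The algorithm can be tried out on finite inputs with the following Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the input is a finite word implicitly followed by zeros, the function names `online_convert` and `value` are hypothetical, and the update $q_{\delta+j} = z_{\delta+j} - a_j\beta^\delta$ is the one used in the proof below. For an integer base all arithmetic is exact; for a non-integer base passed as a float, the comparisons may be affected by rounding.

```python
from fractions import Fraction

def online_convert(digits, beta, delta):
    """Convert a finite digit word (implicitly followed by zeros) on-line.

    Returns the output digits a_1, a_2, ... produced after the first `delta`
    inputs, together with the final remainder q.
    """
    top = int(beta // 1)            # floor(beta)
    scale = beta ** delta           # beta^delta
    q = 0
    for d in digits[:delta]:        # transient phase: read, output nothing
        q = beta * q + d
    out = []
    for d in digits[delta:]:        # synchronous phase: one output per input
        z = beta * q + d
        a = int(z // scale) if z < beta * scale else top
        q = z - a * scale
        out.append(a)
    return out, q

def value(digits, beta):
    """Exact numerical value of a finite digit word in a rational base."""
    return sum(Fraction(d) / Fraction(beta) ** (k + 1) for k, d in enumerate(digits))
```

For base 2 with $D = \{0, \ldots, 4\}$ and $\delta = 2$, padding the input with two extra zeros flushes the remainder, so the input and output words then have exactly the same value.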

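Proof of the algorithm.

Claim: For all $j \ge 1$, one has $0 \le q_{\delta+j} < \beta^\delta$ and $a_j \in A$.

1. For $1 \le i \le \delta$ we get

$$q_i = \beta^{i-1}d_1 + \cdots + d_i.$$

Then $q_i \le q_\delta < \beta^\delta$ by hypothesis on the input $(d_j)_{j \ge 1}$.

2. Suppose that for some $j \ge 1$

$$0 \le q_{\delta+j-1} < \beta^\delta \qquad (H)$$

• If $z_{\delta+j} < \beta^{\delta+1}$ then $a_j = \lfloor z_{\delta+j}/\beta^\delta \rfloor$. Thus $0 \le a_j \le \lfloor\beta\rfloor$. Then
$q_{\delta+j} = \beta^\delta\{z_{\delta+j}/\beta^\delta\} < \beta^\delta$.

• If $z_{\delta+j} \ge \beta^{\delta+1}$ then $a_j = \lfloor\beta\rfloor$ and $q_{\delta+j} = z_{\delta+j} - \lfloor\beta\rfloor\beta^\delta$. Then
$q_{\delta+j} \ge \beta^{\delta+1} - \lfloor\beta\rfloor\beta^\delta \ge 0$. On the other hand,
$q_{\delta+j} = \beta q_{\delta+j-1} + d_{\delta+j} - \lfloor\beta\rfloor\beta^\delta < \beta^{\delta+1} + d - \lfloor\beta\rfloor\beta^\delta$
by hypothesis (H), thus by condition (*), $q_{\delta+j} < \beta^\delta$. Hence the claim is proved.

We then get, for all $j \ge 1$,

$$\frac{d_1}{\beta} + \cdots + \frac{d_{\delta+j}}{\beta^{\delta+j}} = \frac{a_1}{\beta} + \cdots + \frac{a_j}{\beta^j} + \frac{q_{\delta+j}}{\beta^{\delta+j}}.$$

Since $q_{\delta+j} < \beta^\delta$, when $j$ tends towards infinity, $\sum_{j \ge 1} d_j\beta^{-j} = \sum_{j \ge 1} a_j\beta^{-j}$,
with the digits $a_j$ in $A$, thus $\chi((d_j)_{j \ge 1}) = (a_j)_{j \ge 1}$. □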

Remark 1. If $\delta$ satisfies (*), then any natural $\gamma > \delta$ satisfies (*) as well.
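
Since (*) is monotone in $\delta$, the smallest admissible delay can be found by a simple search. The sketch below (the function name `min_delay` is illustrative) evaluates (*) directly; with a floating-point base the comparison is only approximate, which is harmless in the cases tried here because the inequality is far from an equality, while the integer case is exact.

```python
from math import floor

def min_delay(beta, d):
    """Smallest positive integer delta with beta^(delta+1) + d <= beta^delta * (floor(beta) + 1)."""
    delta = 1
    while beta ** (delta + 1) + d > beta ** delta * (floor(beta) + 1):
        delta += 1
    return delta
```

For addition, `d` is twice the largest canonical digit; the values obtained for base 2 on $\{0, 1, 2\}$, for the golden ratio, and for $(3 + \sqrt{5})/2$ agree with the delays quoted in the examples of Section 7.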

4 Continuity 

Let $D$ be a finite alphabet. One defines a distance $\rho$ on $D^{\mathbb{N}}$ as follows: let
$v = (v_j)_{j \ge 1}$ and $w = (w_j)_{j \ge 1}$ be in $D^{\mathbb{N}}$, $\rho(v, w) = 2^{-r}$ where $r = \min\{i \mid v_i \ne w_i\}$.
The set $D^{\mathbb{N}}$ is then a compact metric space. This topology is equivalent to the
product topology. We first give a general result, not related to the base.

Proposition 1. Let $D$ and $A$ be two finite alphabets. Any function $D^{\mathbb{N}} \to A^{\mathbb{N}}$
which is on-line computable with delay $\delta$ is $2^\delta$-Lipschitz, and is thus uniformly
continuous.
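
The statement is given without proof; a possible argument, offered here only as a sketch, follows directly from the definition of on-line computability. Take $v \ne w$ with $\rho(v, w) = 2^{-r}$ and write $\varphi$ for the function (if $\varphi(v) = \varphi(w)$ there is nothing to prove).

```latex
% Sketch: \varphi : D^{\mathbb N} \to A^{\mathbb N} is on-line computable with delay \delta,
% and v_i = w_i for all i < r.
\begin{align*}
  & \varphi(v)_k \text{ depends only on } v_1, \ldots, v_{k+\delta}, \\
  & \text{hence } \varphi(v)_k = \varphi(w)_k \text{ whenever } k + \delta \le r - 1, \\
  & \text{so } r' := \min\{k \mid \varphi(v)_k \ne \varphi(w)_k\} \ge r - \delta, \\
  & \rho(\varphi(v), \varphi(w)) = 2^{-r'} \le 2^{-(r - \delta)} = 2^{\delta}\,\rho(v, w).
\end{align*}
```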

As a corollary we get the following result. 

Proposition 2. Let $D$ be a set of nonnegative digits containing $A$. The digit
set conversion $\chi : D^{\mathbb{N}} \to A^{\mathbb{N}}$ in base $\beta$ defined in Theorem 1 is uniformly
continuous.

The results presented below are a straightforward generalization of those
proved by Eilenberg [6] in the case where $\beta$ is an integer.

Proposition 3. Let $D$ be a finite alphabet of digits. The function numerical
value $\pi_\beta : D^{\mathbb{N}} \to \mathbb{R}$ is continuous.

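We now consider functions taking their values in base $\beta$ into the unit interval
[0, 1]. Let $A = \{0, \ldots, \lfloor\beta\rfloor\}$ and $D = \{0, \ldots, d\}$, $d \ge \lfloor\beta\rfloor$. A function $\chi : D^{\mathbb{N}} \to A^{\mathbb{N}}$
is said to be $\beta$-consistent if there exists a function $f : [0, 1] \to [0, 1]$ such
that the diagram

$$\begin{array}{ccc}
D^{\mathbb{N}} & \xrightarrow{\ \chi\ } & A^{\mathbb{N}} \\
\downarrow{\scriptstyle \pi_\beta} & & \downarrow{\scriptstyle \pi_\beta} \\
{[0, 1]} & \xrightarrow{\ f\ } & [0, 1]
\end{array}$$

commutes.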

Proposition 4. Any $\beta$-consistent function which is on-line computable induces
a function on real numbers which is continuous.



5 The Pisot Case 

An algebraic integer is a root of a monic polynomial with integral coefficients. 
A Pisot number is an algebraic integer > 1 such that all its algebraic conjugates 
are smaller than 1 in modulus. 

Theorem 2. Let $\beta$ be a Pisot number, let $A = \{0, \ldots, \lfloor\beta\rfloor\}$, and let
$D = \{0, \ldots, d\}$ such that $d \ge \lfloor\beta\rfloor$. There exists an on-line finite state automaton
with delay $\delta$, where $\delta$ satisfies (*), which realizes a digit set conversion
$\chi : D^{\mathbb{N}} \to A^{\mathbb{N}}$ in base $\beta$.

Proof. Let $M(X)$ be the minimal polynomial of $\beta$, of degree $m$, and let $\beta_1 = \beta$,
$\beta_2, \ldots, \beta_m$ be its roots. For $2 \le i \le m$, $|\beta_i| < 1$. Recall that $\Lambda = \mathbb{Z}[X]/(M(X))$
is a discrete lattice of rank $m$. Define

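$$\Lambda_\delta = \begin{cases} \Lambda & \text{if } m \le \delta \\ \{q(X) = z_{\delta-1}X^{\delta-1} + \cdots + z_0 + z_{-1}X^{-1} + \cdots + z_{\delta-m}X^{\delta-m} \mid X^{m-\delta}q(X) \in \Lambda\} & \text{otherwise} \end{cases}$$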

The norm of an element $q$ of $\Lambda_\delta$ is taken as $\|q\| = \max_{1 \le i \le m} |q(\beta_i)|$. For
$2 \le i \le m$ set

$$\gamma_i = \sup\{|c - a\beta_i^\delta| \mid c \in D,\ a \in A\}.$$

We define an on-line finite state automaton $\mathcal{A} = (Q, D \times (A \cup \varepsilon), E, \{q_0\})$ as
follows. The set of synchronous states is equal to

$$Q_s = \{q(X) \in \Lambda_\delta \mid 0 \le q(\beta) < \beta^\delta \text{ and for } 2 \le i \le m,\ |q(\beta_i)| < \gamma_i/(1 - |\beta_i|)\}.$$

Since for any $q$ in $Q_s$, $\|q\|$ is bounded, $Q_s$ is a finite set. The set of transient
states $Q_t$ is defined by

$$Q_t = \{q_j(X) = d_1X^{j-1} + \cdots + d_j \bmod M \mid 1 \le j \le \delta - 1,\ d_1, \ldots, d_j \in D\} \cup \{q_0\}.$$

Note that if $q_j \in Q_t$ then $q_j(\beta) < \beta^\delta$; and, for $2 \le i \le m$,
$|q_j(\beta_i)| < d/(1 - |\beta_i|) \le \gamma_i/(1 - |\beta_i|)$. Hence transient states satisfy the same bound inequalities
as synchronous states. For $1 \le i \le \delta$, transient edges are defined by

$$q_{i-1} \xrightarrow{d_i/\varepsilon} q_i$$

with $q_0(X) = 0$ and $q_i(X) = Xq_{i-1}(X) + d_i$.

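The synchronous edges are defined by: for $j \ge 1$ and $q_{\delta+j-1}(X) \in Q$, define
an edge

$$q_{\delta+j-1} \xrightarrow{d_{\delta+j}/a_j} q_{\delta+j}$$

such that

$$Xq_{\delta+j-1}(X) + d_{\delta+j} = X^\delta a_j + q_{\delta+j}(X) \bmod M(X)$$

with $a_j$ in $A$. For the choice of $a_j$ we proceed as in the on-line algorithm given
in Theorem 1, replacing $X$ by $\beta$. Hence for all $j \ge 0$, $0 \le q_{\delta+j}(\beta) < \beta^\delta$.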

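For $2 \le i \le m$, we get

$$|q_{\delta+j}(\beta_i)| = |\beta_i q_{\delta+j-1}(\beta_i) + d_{\delta+j} - a_j\beta_i^\delta| < |\beta_i|\frac{\gamma_i}{1 - |\beta_i|} + \gamma_i = \frac{\gamma_i}{1 - |\beta_i|}.$$

Thus, for all $j \ge 0$, $q_{\delta+j} \in Q_s$. As in Theorem 1, if there is an infinite path in
the automaton $\mathcal{A}$ starting in $q_0$ and labelled by

$$q_0 \xrightarrow{d_1/\varepsilon} q_1 \cdots \xrightarrow{d_\delta/\varepsilon} q_\delta \xrightarrow{d_{\delta+1}/a_1} q_{\delta+1} \xrightarrow{d_{\delta+2}/a_2} q_{\delta+2} \cdots$$

then $\sum_{j \ge 1} d_j\beta^{-j} = \sum_{j \ge 1} a_j\beta^{-j}$, with the digits $a_j$ in $A$, and $\chi((d_j)_{j \ge 1}) = (a_j)_{j \ge 1}$. □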
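
To make the construction concrete, here is a sketch for the golden ratio only, with $D = \{0, 1, 2\}$ and the delay $\delta = 4$ given by (*). States are stored exactly as pairs $(a, b)$ standing for $a + b\beta$, i.e. as elements of $\mathbb{Z}[X]/(X^2 - X - 1)$, and comparisons with $\beta^4 = 2 + 3\beta$ are done in integer arithmetic. All names (`times_phi`, `is_nonneg`, `step`, `run`) are illustrative, inputs are assumed to begin with three zeros, and the automaton is explored on the fly rather than built in full.

```python
DELTA = 4          # delay given by (*) for the golden ratio with d = 2
PHI_POW = (2, 3)   # phi^4 = 2 + 3*phi, stored as (a, b) meaning a + b*phi

def times_phi(q):
    """Multiply a + b*phi by phi, using phi^2 = phi + 1."""
    a, b = q
    return (b, a + b)

def sub(p, q):
    return (p[0] - q[0], p[1] - q[1])

def is_nonneg(q):
    """Exact test a + b*phi >= 0, i.e. s + t*sqrt(5) >= 0 with s = 2a + b, t = b."""
    a, b = q
    s, t = 2 * a + b, b
    if s >= 0 and t >= 0:
        return True
    if s <= 0 and t <= 0:
        return False                # not both zero here, so the value is negative
    if s > 0:                       # t < 0: compare s with |t| * sqrt(5)
        return s * s >= 5 * t * t
    return 5 * t * t >= s * s       # s < 0 < t

def step(q, d):
    """One synchronous edge: read digit d in state q, return (output digit, next state)."""
    z = times_phi(q)
    z = (z[0] + d, z[1])
    # floor(phi) = 1, so both branches of the algorithm reduce to: a = 1 iff z >= phi^4
    a = 1 if is_nonneg(sub(z, PHI_POW)) else 0
    return a, (sub(z, PHI_POW) if a else z)

def run(digits):
    """Feed a digit word over {0, 1, 2}; return the outputs and the synchronous states visited."""
    q = (0, 0)
    for d in digits[:DELTA]:        # transient part: no output
        q = times_phi(q)
        q = (q[0] + d, q[1])
    out, seen = [], {q}
    for d in digits[DELTA:]:
        a, q = step(q, d)
        out.append(a)
        seen.add(q)
    return out, seen
```

Every state met along a run should satisfy $0 \le q(\beta) < \beta^4$, which is the bound that makes the state set finite.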

Corollary 1. If $\beta$ is a Pisot number, nonnormalized addition and multiplication
by a fixed positive integer are computable by on-line finite state automata.

6 Numeration Systems for the Integers 

Let us first recall some definitions. Let $U = (u_n)_{n \ge 0}$ be a strictly increasing
sequence of integers with $u_0 = 1$. Every positive integer $s$ has a representation
in the system $U$ by the following greedy algorithm (see [8]): Let $n$ be such that
$u_n \le s < u_{n+1}$; let $s_n$ be the quotient of the Euclidean division of $s$ by $u_n$,
and let $r_n$ be the remainder: $s_n = q(s, u_n)$ and $r_n = r(s, u_n)$. Then iterate
$s_k = q(r_{k+1}, u_k)$ and $r_k = r(r_{k+1}, u_k)$ for $n - 1 \ge k \ge 0$. Then $s = s_nu_n + \cdots + s_0u_0$.
The digits $s_k$ are such that $0 \le s_k < u_{k+1}/u_k$. The word $s_n \cdots s_0$ is the normal
$U$-representation of $s$.
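
The greedy algorithm for integers translates directly into code. In this sketch the function name `normal_representation` is illustrative, and the list `u` is assumed to be long enough to contain a term larger than `s`.

```python
def normal_representation(s, u):
    """Greedy (normal) representation of s >= 1 in the system u = [u_0, u_1, ...], u_0 = 1.

    Assumes u is strictly increasing and contains a term larger than s.
    Digits are returned most significant first: s_n, ..., s_0.
    """
    n = max(i for i, ui in enumerate(u) if ui <= s)   # u_n <= s < u_{n+1}
    digits, r = [], s
    for k in range(n, -1, -1):
        digit, r = divmod(r, u[k])                    # s_k = q(r_{k+1}, u_k), r_k = r(r_{k+1}, u_k)
        digits.append(digit)
    return digits
```

With the powers of ten this gives the usual decimal digits; with the Fibonacci numbers 1, 2, 3, 5, 8, ... it gives representations without two consecutive 1's.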
Let us denote by $d_\beta(1)$ the $\beta$-expansion of 1 computed by the greedy algorithm
of Subsect. 2.1. If $d_\beta(1)$ is finite, i.e. $d_\beta(1) = a_1 \cdots a_m$, define a sequence
$(t_k)_{k \ge 1}$ as $(t_k)_{k \ge 1} = (a_1 \cdots a_{m-1}(a_m - 1))^\omega$. If $d_\beta(1)$ is infinite take
$(t_k)_{k \ge 1} = d_\beta(1)$. A sequence $U_\beta = (u_n)_{n \ge 0}$ can be canonically associated with
$\beta$ as follows. Let $u_0 = 1$ and for $n \ge 1$ let

$$u_n = t_1u_{n-1} + \cdots + t_nu_0 + 1.$$

We then have the following result [4]: the finite factors of $\beta$-expansions of real
numbers of [0, 1] and the normal $U_\beta$-representations of positive integers are the
same. In particular, normal $U_\beta$-representations of the integers are words on the
alphabet $A = \{0, \ldots, \lfloor\beta\rfloor\}$. The system $U_\beta$ is the numeration system associated
with $\beta$.
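
When $d_\beta(1)$ is finite, the sequence $(t_k)$ is purely periodic and the recurrence for $u_n$ is easy to evaluate. The sketch below handles only that case; the function name `associated_system` and its `period` argument (the word $a_1 \cdots a_{m-1}(a_m - 1)$) are illustrative choices, not notation from the paper.

```python
from itertools import cycle, islice

def associated_system(period, length):
    """First `length` terms of u_n = t_1 u_{n-1} + ... + t_n u_0 + 1, with (t_k) = period^omega."""
    t = list(islice(cycle(period), length))
    u = [1]                                            # u_0 = 1
    for n in range(1, length):
        u.append(sum(t[i] * u[n - 1 - i] for i in range(n)) + 1)
    return u
```

For the golden ratio, $d_\beta(1) = 11$, so the period is `[1, 0]` and the Fibonacci numbers 1, 2, 3, 5, 8, ... are obtained; for the integer base 10, $d_\beta(1)$ is the single digit 10, the period is `[9]`, and the powers of ten are obtained.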

Let $\chi : D^* \to A^*$. The prolongation of $\chi$ is a function $\bar\chi : D^*0^\omega \to A^*0^\omega$
defined by: let $v$ and $w$ be in $D^*$ such that $\chi(v) = w$, then $\bar\chi(v0^\omega) = w0^\omega$. The
function $\chi$ will be said to be continuous if its prolongation is continuous. By
Theorem 1 and Proposition 2 we have the following result.

Corollary 2. Let $U_\beta$ be the numeration system associated with a number $\beta > 1$.
Let $D$ be a set of nonnegative digits containing $A$. Then there exists a digit set
conversion $\chi : D^* \to A^*$ in the system $U_\beta$ which is on-line computable and
continuous.

If $\beta$ is a Pisot number then $d_\beta(1)$ is eventually periodic [3]. In that case, the
sequence $U_\beta$ is linearly recurrent.

Corollary 3. Let $U_\beta$ be the linear numeration system associated with a Pisot
number $\beta$. Let $D$ be a set of nonnegative digits containing $A$. Then there exists
a digit set conversion $\chi : D^* \to A^*$ in the system $U_\beta$ which is computable by
an on-line finite state automaton with delay $\delta$, where $\delta$ satisfies (*).

7 Examples 

We illustrate the previous results on well-known Pisot numbers. 

Example 1. Let $\beta$ be an integer $\ge 2$. On the alphabet $A = \{0, \ldots, \beta\}$, the
representation of numbers is redundant. Addition on $A$ is computable by an
on-line finite state automaton, with delay 2.

For $\beta = 2$, this representation is the well-known carry-save representation.
We give below the on-line finite automaton for addition in base $\beta = 2$. Let
$\mathcal{A} = (Q, \{0, \ldots, 4\} \times (\{0, 1, 2\} \cup \varepsilon), E, q_0)$, with $Q = Q_t \cup Q_s$. All input words
begin with 00. The set of transient states is $Q_t = \{q_0, q_1\}$. The set of synchronous
states is

$$Q_s = \{q(X) \in \mathbb{Z}[X]/(X - 2) \mid 0 \le q(2) < 4\} = \{0, 1, 2, 3\}.$$

Transient edges are

$$q_0 \xrightarrow{0/\varepsilon} q_1 \xrightarrow{0/\varepsilon} 0.$$
The transition matrix of the synchronous part of $\mathcal{A}$ is given in the array below:
the entry $(i, j)$ contains the label of edges from state $i$ to state $j$.

      |    0     |    1     |    2     |    3
    --+----------+----------+----------+----------
    0 | 0/0, 4/1 | 1/0      | 2/0      | 3/0
    1 | 2/1      | 3/1      | 0/0, 4/1 | 1/0
    2 | 4/2, 0/1 | 1/1      | 2/1      | 3/1
    3 | 2/2      | 3/2      | 4/2, 0/1 | 1/1
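
The synchronous edges of this example can be regenerated from the update rule of the on-line algorithm of Theorem 1, which gives a way to cross-check the array. The sketch below assumes an integer base, so that states are plain integers; the function name `synchronous_edges` and its parameters are illustrative.

```python
def synchronous_edges(beta=2, delta=2, d_max=4):
    """Map (state, input digit) -> (output digit, next state) for an integer base beta."""
    scale = beta ** delta
    edges = {}
    for q in range(scale):                 # synchronous states 0 .. beta^delta - 1
        for d in range(d_max + 1):
            z = beta * q + d
            a = z // scale if z < beta * scale else beta   # floor(beta) = beta here
            edges[(q, d)] = (a, z - a * scale)
    return edges
```

An edge labelled $d/a$ from state $i$ to state $j$ in the array corresponds to the entry `edges[(i, d)] == (a, j)`.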



Example 2. Let $\beta = (1 + \sqrt{5})/2$ be the golden ratio; the associated linear
numeration system is the Fibonacci numeration system. Then $A = \{0, 1\}$. Formula (*)
gives $\delta = 4$ for addition, with $D = \{0, 1, 2\}$. I have given in [10] an on-line finite
automaton for addition with delay 3. The automaton constructed with delay 4
is not minimal in the number of states, but it is equivalent to the automaton
with delay 3. In fact, the construction given in Theorem 1 and in Theorem 2
works with delay $\delta' = 3$, the bound on the states becoming

$$0 \le q(\beta) < \rho$$

with here $\rho = \beta + 2$, and Condition (*) being replaced by

$$\beta\rho + d \le \beta^{\delta'}\lfloor\beta\rfloor + \rho.$$






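This gives the on-line finite automaton with delay 3 below.

Let $\mathcal{A} = (Q, \{0, 1, 2\} \times (\{0, 1\} \cup \varepsilon), E, \{q_0\})$. Input words begin with 00. The
set of transient states is $\{q_0, q_1, q_2\}$. The elements of $\Lambda$ are denoted by words of
length 2. The transient part of $\mathcal{A}$ is of the form

$$q_0 \xrightarrow{0/\varepsilon} q_1 \xrightarrow{0/\varepsilon} q_2 \quad\text{and}\quad q_2 \xrightarrow{0/\varepsilon} 00;\ q_2 \xrightarrow{1/\varepsilon} 01;\ q_2 \xrightarrow{2/\varepsilon} 02.$$

In the synchronous part of $\mathcal{A}$ edges are the following ones: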



[Table of the synchronous edges of $\mathcal{A}$: not recoverable from the source.]



Example 3. Let $\beta = (3 + \sqrt{5})/2$. Here $A = \{0, 1, 2\}$. For addition the delay
computed by (*) is 3, which is minimal.



¹ We denote by $\bar{1}$ the signed digit $-1$.

Abstract. We propose the study of query languages for databases involving
real numbers as data (called real number databases in the sequel).
As main new aspect our approach is based on real number complexity
theory as introduced in [8] and descriptive complexity for the latter developed
in [17]. Using this formal framework a uniform treatment of query
languages for such databases is obtained. Precise results about both the
data- and the expression-complexity of several such query languages are
proved. More explicitly, relying on descriptive complexity theory over $\mathbb{R}$
gives the possibility to derive a hierarchy of complete languages for most
of the important real number complexity classes. A clear correspondence
between different logics and such complexity classes is established. In
particular, it is possible to formalize queries involving in a uniform manner
real spaces of different dimensions. This can be done in such a way
that the logical description exactly reflects the computational complexity
of a query. The latter might circumvent a problem appearing in some
of the former approaches dealing with semi-algebraic databases (see [20],
[18]), where the use of first-order logic over real-closed fields can imply
inefficiency as soon as the dimension of the underlying real space is
not fixed, no matter whether the query under consideration is easy to
compute or not.



1 Introduction 

Semi-algebraic databases have raised increasing interest in recent years. Inspired
by problems of computational and semi-algebraic geometry, many approaches
dealing with real number data and real polynomial inequality constraints
have appeared in the meantime. To give a (by no means complete) list of references,
consider for example [20], [25], [19], [4], [5], [6], the survey paper [18] as
well as the literature cited therein.

An important task in database theory is to study the complexity of queries
asked to databases. In relation with finite model theory and descriptive complexity
theory this led to numerous results on the complexity of several query
languages described by logical means (see [1] for a survey). For the present
work the papers [9] and [26] are of extreme interest. Here the notions of data-
and expression-complexity were introduced and studied; important complexity
classes in the Turing model were captured by the data- resp. expression-complexity
of several such query languages. Our goal is to set up a similar
theory for databases involving real numbers. However, in order to deal in a
comparable manner with data- and expression-complexity also in connection with
real number data, problems arise. It seems not to be clear how to treat them
appropriately as long as one is restricted to the concepts used so far. Thus, before
starting with the mathematical reasoning let us consider an elementary example
from computational geometry and explain some of these problems.

Example 1. Given $n$ points in $\mathbb{R}^2$ as database the task is to select those pairs
that have the largest Euclidean distance. Following the approach taken in the
seminal paper [20] (and many others as well, for example [5], [14], [19], [25]) this
problem would be modeled as follows: the database is a finite relation $\mathcal{R} \subset \mathbb{R}^2$
consisting of the $n$ points. It is given as a quantifier free first-order formula over
the reals having two free variables (and the formula is true exactly for the points
in the database). Now in order to check whether a pair $(x, y)$ of points realizes
the largest distance with respect to pairs in $\mathcal{R}^2$ analyze the following first-order
formula over the reals with the additional symbol $\mathcal{R}$ in the vocabulary (here
$x = (x_1, x_2)$ etc.):

$$\psi(x_1, x_2, y_1, y_2) \equiv \mathcal{R}(x_1, x_2) \wedge \mathcal{R}(y_1, y_2) \wedge \{\forall v_1, v_2, w_1, w_2\ (\mathcal{R}(v_1, v_2) \wedge \mathcal{R}(w_1, w_2)) \Rightarrow (x_1 - y_1)^2 + (x_2 - y_2)^2 \ge (v_1 - w_1)^2 + (v_2 - w_2)^2\}$$

First of all the above problem is a real number problem; that is the point
coordinates are real numbers and the complexity of the problem depends on
the number $n$ of different points. Thus it is a problem perfectly suited to be dealt
with in a corresponding model of computation. Such a model was introduced by
Blum, Shub, and Smale (shortly: BSS) [8] and will be the starting point of our
investigations. Second (and more important) consider the logical structure of $\psi$:
the query is presented as a quantified first-order formula over the real closed
field $\mathbb{R}$. In general, the complexity of evaluating such formulas is supposed to be
tremendous since it results in quantifier elimination. In the BSS setting deciding
existential first-order formulas with polynomials involved having at most degree
2 is already $NP_{\mathbb{R}}$-complete over $\mathbb{R}$ ([8]). Hence the logical shape of the above
query does not reflect its complexity. Though the query is easy to compute it is
expressed by a (quantified) first-order formula. In [20] this problem is circumvented
by considering only databases represented by relations in a fixed space
$\mathbb{R}^k$. Thus if a query on such a database is analyzed a constant bound on the
quantifier depth is obtained. In such a situation the corresponding (constantly
many) quantifiers can be eliminated efficiently and tractability results on the
data complexity of such a query are obtained (in fact, this complexity is in
NC). However, one may argue that there is no necessity to consider the above
largest distance problem only in $\mathbb{R}^2$. It makes perfect sense to regard it as a single
geometrical problem being posed for sets in any $\mathbb{R}^k$ and, in fact, computationally
it is not harder there (of course the complexity will depend on the number of
points and the varying dimension $k$). Similarly, when analyzing the complexity
of query languages over finite structures data-complexity is measured by fixing
a query and evaluating it on different databases, where the latter are given as
finite structures of different sizes (see [26]). In the above formulation considering
the dimension $k$ as part of the input would cause serious troubles since then
the complexity of quantifier elimination enters into the consideration though the
query itself remains tractable. The above problem also sheds light on a further
one; the quantifiers for $v_i$ and $w_i$ in a particular sense are “easy” since they only
range over components of points in $\mathcal{R}$. In some of the approaches to real number
databases such a difference is present when distinguishing between so-called
natural and active domains (see [5], [6] as example). Here, very interesting work
has been done to analyze when it is possible to replace the natural domain of
a quantifier by the active one. However, even if it is possible to perform such a
replacement this process once more can involve quantifier elimination and thus
might destroy a possibly easy shape of a quantifier.

In the present paper we propose the use of descriptive complexity theory over
the reals, as introduced in [17], to study the complexity of query languages for
real number databases. It allows one to establish precise relations between such
languages and real number complexity classes in the sense of [8]. Using metafinite
model theory [16], the approach also separates “easy” from “difficult”
quantifiers by allowing an exact distinction between the discrete and the
continuous aspects of computations and queries. As a consequence, formulas
involving quantifiers which intrinsically range over an infinite domain belong to
a different logic than those involving only bounded ones. For example, the results
show the (generalized) largest distance query to be describable by a first-order
fixed-point formula on so-called R-structures; such formulas only involve bounded
quantifiers, which directly implies that the largest distance problem is computable
in polynomial time (over R).

Paralleling the approach in [26], we introduce the notions of data- and
expression-complexity for real number query languages. For the main logics on
(ordered) R-structures we study these quantities and give completeness results.
Basics on the BSS model of computation over R and on descriptive complexity
theory for it can be found in [7] and [17]. In Section 2 we introduce the notions
of a real number query, of data- and expression-complexity, and of completeness
for a query language. Completeness results are proved in Section 3.



2 Real number databases and queries; data- and 
expression complexity 

Once descriptive complexity over R has been introduced, it is straightforward
to base the notions of real number databases and real number queries on it. We
closely follow the approaches in [9], [26], [1], making the changes necessary
when working over the reals.

We briefly recall the notion of an R-structure from [17]. For a more explicit
introduction see [12], [17], [22].

Definition 1. a) Let $L_s$, $L_f$ be finite vocabularies where $L_s$ may contain
relation and function symbols, and $L_f$ contains function symbols only. An
R-structure of signature $\sigma = (L_s, L_f)$ is a pair $D = (\mathcal{A}, \mathcal{F})$
consisting of

(i) a finite structure $\mathcal{A}$ of vocabulary $L_s$, called the skeleton of $D$,
whose universe $A$ will also be said to be the universe of $D$, and

(ii) a finite set $\mathcal{F}$ of functions $X : A^k \to \mathbb{R}$ interpreting the
function symbols in $L_f$.

b) Let $D$ be an R-structure of skeleton $\mathcal{A}$. We denote by $|A|$ the
cardinality of the universe $A$ of $\mathcal{A}$. An R-structure
$D = (\mathcal{A}, \mathcal{F})$ is ranked if there is a unary function symbol
$r \in L_f$ whose interpretation $\rho$ in $\mathcal{F}$ bijects $A$ with
$\{0, 1, \ldots, |A| - 1\}$. The function $\rho$ is called a ranking. A k-ranking on
$A$ is a bijection between $A^k$ and $\{0, 1, \ldots, |A|^k - 1\}$.
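
The passage from a ranking to a k-ranking can be illustrated by a small sketch. It assumes the k-ranking is the one induced lexicographically by the ranking (a tuple is read as a number in base |A|); the definition only asks for some bijection, so this particular choice, like the names `k_ranking`, `universe` and `rho`, is an assumption made for illustration.

```python
from itertools import product

def k_ranking(universe, rho, k):
    """Map every k-tuple over the universe to its rank in base n = |A|."""
    n = len(universe)
    ranks = {}
    for tup in product(universe, repeat=k):
        value = 0
        for a in tup:
            # most significant coordinate first (lexicographic order)
            value = value * n + rho[a]
        ranks[tup] = value
    return ranks

universe = ["a", "b", "c"]
rho = {"a": 0, "b": 1, "c": 2}      # a ranking: bijection onto {0, 1, 2}
rho2 = k_ranking(universe, rho, 2)  # a 2-ranking: bijection onto {0, ..., 8}
```

The resulting map hits every value in {0, ..., |A|^k - 1} exactly once, which is the property the definition requires.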

Example 2 ([17]). Let us see how to describe the question whether a degree
four polynomial has a real zero with an existential second-order sentence. This
problem is known to be NP_R-complete in our framework. Consider the signature
$(\emptyset, \{r, c\})$ where the arities of r and c are 1 and 4 respectively, and
require that r is interpreted as a ranking.

Let $D = (\mathcal{A}, \mathcal{F})$ be any R-structure where $\mathcal{F}$ consists of
interpretations $C : A^4 \to \mathbb{R}$ and $\rho : A \to \mathbb{R}$ of c and r. Let
$n = |A| - 1$ so that $\rho$ bijects $A$ with $\{0, 1, \ldots, n\}$. Then $D$ defines a
homogeneous polynomial $\hat{g} \in \mathbb{R}[X_0, \ldots, X_n]$ of degree four, namely

\[ \hat{g} = \sum_{(i,j,k,l) \in A^4} C(i,j,k,l)\, X_i X_j X_k X_l . \]

We obtain an arbitrary, that is, not necessarily homogeneous, polynomial
$g \in \mathbb{R}[X_1, \ldots, X_n]$ of degree four by setting $X_0 = 1$ in $\hat{g}$.
We also say that $D$ defines $g$. Notice that for every polynomial $g$ of degree four
in $n$ variables there is an R-structure $D$ of size $n + 1$ such that $D$ defines $g$.

Denote by $0$, $1$, $\bar{0}$ and $\bar{1}$ the first and last elements of $A$ and $A^4$
with respect to $\rho$ and $\rho^4$ respectively. The following sentence quantifies two
functions $X : A \to \mathbb{R}$ and $Y : A^4 \to \mathbb{R}$:

\[
\begin{aligned}
\psi \equiv (\exists X)(\exists Y)\ \bigl(\, & Y(\bar{0}) = C(\bar{0}) \wedge Y(\bar{1}) = 0 \wedge X(0) = 1 \\
 \wedge\ & \forall u_1 \ldots \forall u_4\ [\, u \neq \bar{0} \Rightarrow \exists v_1 \ldots \exists v_4\ (\rho^4(u) = \rho^4(v) + 1 \\
 & \wedge\ Y(u) = Y(v) + C(u)\, X(u_1) X(u_2) X(u_3) X(u_4)) \,] \,\bigr).
\end{aligned}
\]

Here, if $a_i = \rho^{-1}(i)$ for $i = 1, \ldots, n$, then
$(X(a_1), \ldots, X(a_n)) \in \mathbb{R}^n$ describes the zero of $g$, and $Y(u)$ is the
partial sum of all its monomials up to $u = (u_1, \ldots, u_4) \in A^4$ evaluated at the
point $(X(a_1), \ldots, X(a_n))$.

The sentence $\psi$ describes our decision problem in the sense that for any
R-structure $D$ we have $D \models \psi$ if and only if the polynomial $g$ of degree
four defined by $D$ has a real zero.

A straightforward application of the above example shows, in particular, how to
model any semi-algebraic set by means of an R-structure: if the semi-algebraic
set is described by a system of polynomial equalities and inequalities, each of
degree at most d, the coefficient vectors of the polynomials involved
are represented by interpreting function symbols of the R-structure accordingly.
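
The role of the function Y in Example 2 can be made concrete with a short sketch. It assumes that the universe is {0, ..., n} with the identity as ranking, that the coefficient function C is given as a dictionary (missing tuples count as 0), and that floating-point numbers stand in for exact reals; the name `evaluate_by_partial_sums` is hypothetical. The loop accumulates the same partial sums that the sentence forces Y to hold, so the final value is the polynomial evaluated at the candidate point.

```python
from itertools import product

def evaluate_by_partial_sums(n, C, x):
    """Return g(x) for the degree-four polynomial defined by C.

    C maps 4-tuples over {0, ..., n} to coefficients; x = (x_1, ..., x_n).
    X(0) is fixed to 1, mirroring the conjunct X(0) = 1 of the sentence.
    """
    X = [1.0] + list(x)
    y = 0.0
    # product() enumerates 4-tuples in lexicographic order, i.e. in the
    # order of the 4-ranking induced by the identity ranking.
    for u in product(range(n + 1), repeat=4):
        y = y + C.get(u, 0.0) * X[u[0]] * X[u[1]] * X[u[2]] * X[u[3]]
    return y

# g(x_1) = x_1^4 - 1, homogenized with X_0
C = {(1, 1, 1, 1): 1.0, (0, 0, 0, 0): -1.0}
```

A point x is a witness for the existential sentence exactly when the final partial sum is 0; the sketch only verifies a given candidate and does not search for one.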

Definition 2. Fix a signature $\sigma = (L_s, L_f)$ of finite vocabularies $L_s$ and
$L_f$, where $L_s$ may contain relation and function symbols and $L_f$ contains
function symbols only.

a) A real number database of type $\sigma$ is a (ranked) R-structure $D$ of signature
$\sigma$.

b) Fix a natural number k. A real number query Q of type $(\sigma, k)$ is a partial map
assigning (if defined) to every real number database $D$ of type $\sigma$ a relation
of arity k defined on the finite universe of $D$.

c) A real number query is computable if it is BSS-computable with respect to
the usual encoding of R-structures as inputs for BSS machines [17].

Remark 1. a) In principle, the notion of a semi-algebraic database would fit the
objects considered here as well. Since this notion is already used in the theory
of constraint databases, we prefer to call our objects real number databases
(even though the main objects of our databases are semi-algebraic sets).

b) Real number queries could be defined more generally by mapping $D$ to another
database of (possibly) different signature. The expositions in [1] and
in [26] concerning such a restriction hold mutatis mutandis in our setting as well.

c) Just as in descriptive complexity theory over finite structures, the problem
of genericity appears in our setting as well (see [17]). In principle one can
think about similar ways to deal with it in connection with R-structures,
such as defining generic queries (cf. [1]) or extending the definition of special
devices such as “generic machines” [2] or reflective relational machines [3]
to the BSS model. Since we want to focus on outlining our approach for real
number databases, throughout this paper we assume all R-structures to be
ranked.

In this paper we consider real number query languages arising from
different logics on R-structures. Given a formula $\varphi(a_1, \ldots, a_s)$ in a
particular logic for R-structures of signature $\sigma$, with free variables
$a_1, \ldots, a_s$ ranging over the finite universe, the corresponding query
$Q_\varphi$ is defined by assigning to an R-structure $D$ of signature $\sigma$ the
relation of arity s determined by all assignments for the $a_i$ that make $\varphi$ a
true sentence in $D$.

Definition 3. a) Given a real number query $Q_\varphi$ of signature $(\sigma, k)$
corresponding to a formula $\varphi$ in the above sense, its graph is the set

\[ Gr(Q_\varphi) := \{ (a, D) \mid a \in A^k,\ D \models \varphi(a) \}. \]

b) Given a real number database $D$ of signature $\sigma$, its graph with respect to a
language L is the set

\[ Gr_L(D) := \{ (a, \varphi) \mid \varphi \text{ is a formula in } L \text{ of arity } s,\ a \in A^s,\ D \models \varphi(a) \}. \]

We are interested in the BSS complexity of deciding membership in one of
the two graph sets defined above. This means that either the language is fixed and
we measure the complexity for varying real number databases as a function of the
size of the database, or, vice versa, the database is fixed and the complexity for
varying formulas in a given language is studied as a function of the formula length.
In the following, the notion of completeness of particular languages refers to
reduction algorithms running simultaneously in polynomial real time and constant
space. This type of reduction for real number problems is considered here for the
first time. Our results substantiate that the corresponding class (P_R, const) can be
seen as an analogue of logspace in the BSS setting.

Definition 4. a) The data-complexity of a language L for real number databases
is the BSS complexity of deciding membership of $(a, D)$ in $Gr(Q_\varphi)$ for
any fixed $\varphi \in L$. A language L is data-complete for a real number complexity
class C if for all $\varphi \in L$ deciding membership in $Gr(Q_\varphi)$ is in C and,
furthermore, there exists a $\varphi_0 \in L$ such that the latter problem for the
graph $Gr(Q_{\varphi_0})$ is complete in C with respect to reductions in (P_R, const).

b) The expression-complexity of a language L for real number databases is the
BSS complexity of deciding membership of $(a, \varphi)$ in $Gr_L(D)$ for any fixed
R-structure $D$, where $\varphi \in L$ is a formula over the corresponding signature. A
language L is expression-complete for a real number complexity class C if
for any fixed real number database $D$ the graph membership problem is in C
and, furthermore, there exists a real number database $D_0$ such that the latter
problem with respect to $D_0$ is complete in C under reductions in (P_R, const).
Here complexity is measured in the length of a given formula $\varphi$. The latter
is, intuitively, the number of symbols necessary to write down $\varphi$, where
quantifiers, symbols from the signature, logical and real arithmetic operation
symbols and, most importantly, real numbers are supposed to have unit size.
We skip a formal definition, which is easily given by induction.

In the following we will notationally identify a language for real number
databases and the real number queries resulting from it.



3 Completeness results 

For most of the logics introduced in [17] and [12], completeness results for both
data- and expression-complexity are now shown. In particular, the results establish
a precise correspondence between the logical structure of a real number query
and its complexity. This removes the difficulties arising when easy
queries are formalized in first order logic for real closed fields.

For a precise definition of the logics under consideration we refer to [17], [12]
or the Appendix.

Theorem 1. The language of first order logic without quantifiers (the
quantifier-free fragment of FO_R) is both data-complete and expression-complete in
(P_R, const).

Proof. Since we consider reductions in the class (P_R, const), for both statements
it suffices to show membership of the corresponding graph problem in (P_R, const).
Let a quantifier-free formula $\varphi$ with s free variables be fixed. Consider a real
number database $D = (\mathcal{A}, \mathcal{F})$ of size n and an $a \in A^s$ as input.
Evaluating $\varphi$ in a can be done in a number of steps proportional to the length of
$\varphi$, which is independent of n. Hence there exists a BSS algorithm working in
polynomial (even constant) time and constant space.

Now let the real number database be fixed and consider as input an $a \in A^s$
together with a quantifier-free formula $\varphi$ of length n. In order to evaluate
$\varphi$ in a we can proceed as in [21] for evaluating Boolean expressions. However,
the corresponding algorithm has to be performed simultaneously in polynomial time
and constant space. (Note that Michaux’ result [24] guarantees that this task can be
executed in constant space, but his algorithm might in general result in a
super-polynomial slow-down.) Nevertheless, a closer analysis of Lynch’s method shows
that in this particular situation Michaux’ approach will not result in a slow-down.
The basic ingredient of Lynch’s method is the organization of how to cycle through
the subformulas of $\varphi$. To this aim, in the Turing model at most $O(\log n)$ many
columns are introduced to record specific sub-results (i.e. values in {0, 1}) and to
indicate which part is going to be evaluated next. Hence, we get at most $O(\log n)$
integers of bounded value (independently of n) to store. This means that the
entire information can be stored in a single integer of magnitude at most $2^n$
(as a very crude upper bound). The key point with respect to Michaux’ coding
argument now is that there is only one single integer $M < 2^n$ holding all
the necessary information. In this particular situation his method works without
super-polynomial time trade-off; the information in M can be decoded and
encoded in polynomial time (using repeated squaring). Thus, an adaptation of
Lynch’s algorithm for quantifier-free formulas can be performed in (P_R, const). □
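
The coding step in the proof above, storing a short list of bounded integers in one register, can be sketched as follows. This is only an illustration of the packing idea under simplifying assumptions: Python's integer division and exponentiation stand in for the decoding by repeated squaring that a BSS machine would have to carry out, and the names `encode`, `read_slot` and `write_slot` are hypothetical.

```python
def encode(values, base):
    """Pack a list of integers in range(base) into one integer (slot 0 is least significant)."""
    m = 0
    for v in reversed(values):
        m = m * base + v
    return m

def read_slot(m, base, index):
    """Recover the value stored in the given slot of the packed integer."""
    return (m // base ** index) % base

def write_slot(m, base, index, new_value):
    """Return the packed integer with one slot overwritten."""
    old = read_slot(m, base, index)
    return m + (new_value - old) * base ** index

packed = encode([1, 0, 2, 1], base=3)
```

With O(log n) slots of bounded size the packed integer stays polynomial in n, which is why a single register suffices in this situation.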

Next we consider full first order logic on R-structures.

Theorem 2. The language FO_R of first order logic is data-complete in
(P_R, const). It is expression-complete in DPAT_R, the class of problems decidable
in polynomial digital alternating time over R ([11]).

Proof. Concerning data-complexity, we again only have to show containment of
the graph membership problem in (P_R, const). Since all real number databases
considered here are by convention ordered, one can cycle through all possible
tuples of assignments of the variables quantified in the given formula

\[ \varphi(a) = Q_1 x_1 Q_2 x_2 \ldots Q_k x_k\ \psi(x_1, \ldots, x_k, a), \qquad Q_i \in \{\forall, \exists\}, \]

where $\psi$ is quantifier-free.

Since the number of quantifiers in $\varphi$ is fixed and since FO_R quantifiers only
range over the finite universe, there are only polynomially (namely $n^k$) many
different assignments for the $x_i$ (where n denotes the size of the database). For
each of them $\psi(x, a)$ can be evaluated in (P_R, const) according to the last
theorem. It remains to combine the polynomially many truth values and to check
whether they satisfy the quantifier prefix. Again an argument on the magnitude
of integers coding all this information yields the claim. Given the quantifier
prefix, consider all those subsets T of assignments for the $x_i$ with the following
property: $\varphi(a)$ is evaluated “true” if $\psi(x, a)$ is true exactly for all
$x \in T$. Such a T can be coded by an integer $< 2^{n^k}$, and all such subsets can
be represented via a single integer $< 2^{2^{n^k}}$. Since integers of such magnitude
are computable in time $O(n^k)$ in the BSS model, all the information about the truth
values of $\psi$ necessary to make $\varphi$ true can be coded in polynomial time in a
single register. As above, it follows that for given $(D, a)$ the relation
$D \models \varphi(a)$ can be checked in (P_R, const).
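
The data-complexity argument rests on the observation that a fixed quantifier prefix over a finite universe can be checked by running through all n^k assignments. The following sketch illustrates only this enumeration; it uses recursion and therefore does not reflect the constant-space bookkeeping of the proof. The function `holds`, the encoding of the prefix as strings, and the sample database `X` are assumptions made for the example.

```python
def holds(prefix, psi, universe, a, assigned=()):
    """Decide Q_1 x_1 ... Q_k x_k psi(x, a) over a finite universe.

    prefix is a list of "forall" / "exists"; psi(xs, a) evaluates the
    quantifier-free matrix on a full assignment xs and parameters a.
    """
    if len(assigned) == len(prefix):
        return psi(assigned, a)
    branches = (holds(prefix, psi, universe, a, assigned + (v,))
                for v in universe)
    if prefix[len(assigned)] == "forall":
        return all(branches)
    return any(branches)

# A database with universe {0, 1, 2} and one real-valued function X.
X = {0: 1.5, 1: 4.0, 2: -2.0}
universe = [0, 1, 2]

def is_maximal(xs, a):
    # matrix of the query "forall x: X(x) <= X(a)"
    return X[xs[0]] <= X[a[0]]
```

For a fixed formula the depth of the recursion is the constant k, so the number of evaluations of the matrix is at most n^k.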

Next, consider a real number database $D$ to be given. We want to show that
deciding membership in $Gr_{FO_R}(D)$ is in the class DPAT_R. To this aim we have to
show (see [11]) the existence of a polynomial p and a BSS machine M such that

\[
(a, \varphi) \in Gr_{FO_R}(D) \iff \exists y_1 \forall z_1 \exists y_2 \forall z_2 \ldots \exists y_{p(|\varphi|)} \forall z_{p(|\varphi|)}\ M \text{ accepts the input }
(a, \varphi, y_1, z_1, \ldots, y_{p(|\varphi|)}, z_{p(|\varphi|)}) \text{ in time } \le p(|\varphi|).
\]

W.l.o.g. we assume the universe of $D$ to be $A = \{0, 1\}$. Let a formula
$\varphi(a) = Q_1 x_1 \ldots Q_k x_k\, \psi(a, x) \in$ FO_R of length n be given; clearly,
$k < n$. Thus we can choose a linear polynomial p in the definition of DPAT_R
satisfying the following: for any formula $\varphi$ of length n the machine M on input
$(a, \varphi)$ first matches the quantifier shape $Q_1 \ldots Q_k$ with a corresponding
shape in $\exists y_1 \ldots \forall z_{p(|\varphi|)}$ and then evaluates $\psi$ on
$(x_1, \ldots, x_k, a)$ in linear time.

Let us turn to completeness. The following problem QBSCP (quantified
binary satisfiability circuit problem) is complete under (P_R, const) reductions in
the class DPAT_R: given an algebraic circuit B with n inputs and a quantifier prefix
$Q_1 \ldots Q_n \in \{\exists, \forall\}^n$, decide whether
$Q_1 x_1 \ldots Q_n x_n$ B accepts $(x_1, \ldots, x_n)$. Here the $x_i$ range over
{0, 1}. Its completeness can easily be established in a manner similar to the way
the existence of a DNP_R-complete problem is shown in [11]. Just note
that the Lagrange polynomials used therein can be evaluated in (P_R, const).
QBSCP can be obtained as a membership problem for a particular $Gr_{FO_R}(D_0)$.
Choose a real number database $D_0 = (\mathcal{A}_0, \mathcal{F}_0)$ where the universe
is $A_0 = \{0, 1\}$, the finite structure associated with $A_0$ is empty and
$\mathcal{F}_0$ consists of the constant nullary function $1 \in \mathbb{R}$ only. Any
QBSCP instance can then be expressed as a membership task for $Gr_{FO_R}(D_0)$ by
representing the given algebraic circuit by a quantifier-free formula of
polynomially bounded length. □

As shown in [17], over ordered R-structures first order fixed-point logic
captures P_R. This implies

Theorem 3. The language FP_R of first order fixed-point logic is data-complete
in P_R and expression-complete in EXP_R.

Proof. For data-completeness, membership in P_R is a consequence of [17].
Completeness follows by reduction from the problem SUBS of deciding whether
a system of quadratic polynomial equations and inequalities over R is solvable
by substitution (see [13]). The PPT reduction algorithm used in [13] in order
to establish P_R-completeness of SUBS can be performed in (P_R, const) as well:
the time used by each processor in those reductions is constant.

Concerning expression-complexity, membership in EXP_R is immediate by
combining the arguments in the proofs of Theorems 1 and 2. Cycling through
all possible assignments of the quantified variables works in time EXP_R, and the
same holds true for evaluating a fixed-point construct. Completeness is
established by reducing any EXP_R problem to the membership problem for
$Gr_{FP_R}(D_0)$, where $D_0$ is as in the proof of Theorem 2. This works exactly as
above, describing the actions of an EXP_R machine M by a function
$Z : A_0^m \to \mathbb{R}$ for a suitable $m \in \mathbb{N}$ that is linear in the size of
the current input for M. The steps performed by M are, as usual, simulated using the
fixed-point rule. Since for a given problem in EXP_R the machine M is fixed, this
reduction works within the complexity class (P_R, const). □



Consider once more Example 1. In our approach the problem is represented
by a real number database $D = (\mathcal{A}, \mathcal{F})$ where $A = \{1, \ldots, n\}$
and $\mathcal{F}$ consists of one function $X : A^2 \to \mathbb{R}$; here $X(\cdot, j)$
for fixed j represents the j-th point of the database $\mathcal{R}$ (for simplicity we
have chosen the dimension of $\mathbb{R}^n$ equal to the number of points; this can be
generalized without difficulty). Now the largest distance problem is expressed by an
FP_R formula: for two points $X(\cdot, i)$ and $X(\cdot, j) \in \mathbb{R}^n$ in
$\mathcal{R}$ (i.e. $i, j \in A$ and $X(k, i)$ is the k-th component of $X(\cdot, i)$)
define a function which sums up the n components $(X(k, i) - X(k, j))^2$,
$1 \le k \le n$. The latter function is describable as a fixed point. Thus the related
query can be seen to be computable in polynomial time according to [17] just by
analyzing its logical form. No quantifier elimination over a real closed field is
involved in the logical description.
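
The computation behind this description can be sketched directly. The sketch assumes the points are given column-wise as `X[k][i]` (the k-th component of the i-th point), uses floats in place of exact reals, and returns the largest squared distance; the exact form of the query in Example 1 is not restated here, and the name `largest_squared_distance` is hypothetical. The inner loop plays the role of the fixed point that sums up the squared coordinate differences.

```python
def largest_squared_distance(X, n_points, dim):
    """Return the maximum of sum_k (X[k][i] - X[k][j])^2 over all pairs i, j."""
    best = 0.0
    for i in range(n_points):
        for j in range(n_points):
            total = 0.0
            # bounded iteration over the coordinates, mirroring the
            # inductively defined sum in the fixed-point formula
            for k in range(dim):
                diff = X[k][i] - X[k][j]
                total += diff * diff
            best = max(best, total)
    return best

# three points in the plane: (0, 0), (3, 4), (1, 1)
X = [[0.0, 3.0, 1.0],
     [0.0, 4.0, 1.0]]
```

All loops range over the finite universe only, so the running time is polynomial in the number of points and the dimension, in line with the bounded quantifiers of the formula.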

We next turn to logics involving second order constructs. This means one can
quantify over functions from the finite universe to R. Thus second order logic is
related to quantifier elimination problems for real closed fields and therefore to
nondeterminism as defined in the BSS model.

The following theorem and its corollary can be shown in a manner similar to
the previous theorems.

Theorem 4. The language ∃SO_R of existential second order logic is data-complete
in NP_R and expression-complete in NEXP_R. □

The digital version of existential second order logic captures the digital
analogues of NP_R and NEXP_R.

Corollary 1. The language ∃DSO_R of digital existential second order logic is
data-complete in DNP_R and expression-complete in DNEXP_R. □

Finally, we treat the logic SO_R.

Theorem 5. The language SO_R of second order logic is data-complete in PH_R
and expression-complete in PAT_R, the class of problems decidable in polynomial
alternating time over R.

Proof. This is almost the same as the proof of Theorem 2. Now the second order
quantifiers range over elements in R (more precisely, over function variables),
which is why on the data-complexity side one jumps to the polynomial hierarchy.
For expression-complexity this results in replacing the binary quantifiers by “full”
ones, i.e. switching from DPAT_R to PAT_R. □

We have seen that using descriptive complexity theory over the real numbers
as the basic tool for dealing with queries to real number databases allows a
uniform treatment of both data- and expression-complexity issues. A strong
correspondence between the logic on R-structures necessary to describe a query and
the complexity of computing the latter can be established. In particular, it is
possible to treat queries for semi-algebraic (in the sense of constraint database
theory) databases whose data belong to real Euclidean spaces of different dimensions.
The following table summarizes our results:



Logic                      data-complete in    expression-complete in

FO_R (quantifier-free)     (P_R, const)        (P_R, const)
FO_R                       (P_R, const)        DPAT_R
FP_R                       P_R                 EXP_R
∃DSO_R                     DNP_R               DNEXP_R
∃SO_R                      NP_R                NEXP_R
SO_R                       PH_R                PAT_R



Similar results can be obtained for the other logics considered in [17] and [12]. 

We would like to stress that the basic concept of k-ary generalized tuples
introduced in [20] is captured by our approach as well. If in the former setting a
geometric object is given by a finite collection of quantifier-free formulas, in the
latter we can easily describe these formulas (resp. the polynomials involved) as an
R-structure (see Example 2). Moreover, the size of this R-structure is determined
by the important quantities influencing complexity issues, namely the number
of variables (which need not be fixed), the degree of the polynomials in the
describing formula and the number of constraints.

Abstract. A real number is computable if it is the limit of an effectively
converging computable sequence of rational numbers, and left (right)
computable if it is the supremum (infimum) of a computable sequence
of rational numbers. By applying the operations “sup” and “inf” alternately
n times to computable (multiple) sequences of rational numbers
we introduce a non-collapsing hierarchy $\{\Sigma_n, \Pi_n, \Delta_n : n \in \mathbb{N}\}$
of real numbers. We characterize the classes $\Sigma_2$, $\Pi_2$ and $\Delta_2$ in
various ways and give several interesting examples.

Key words. Real Numbers; Arithmetical Hierarchy; Recursively Approximable
Real Number.



1 Introduction 

Every real number is a limit of some Cauchy sequence of rational numbers. By
requiring both the effectivity of the sequence and the effectivity of its
convergence, we obtain the notion of computable real numbers. Namely, a real number
$x \in \mathbb{R}$ is computable if there is a computable sequence
$(r_n)_{n \in \mathbb{N}}$ of rational numbers which converges to x effectively, i.e.,
$(\forall n \in \mathbb{N})(|x - r_n| < 2^{-n})$. The class of all computable real
numbers is denoted by $\Delta_1$. Here effectivity of convergence is essential,
because there are computable sequences of rational numbers which converge to
non-computable real numbers. A standard example is given by Specker [13], i.e., a
real number $x_A := \sum_{i \in A} 2^{-i}$ for a non-recursive r.e. set
$A \subseteq \mathbb{N}$. In some sense, the real number $x_A$ is still quite
“effective” because we can enumerate all “1” positions in its binary expansion and
hence there is an increasing computable sequence of rational numbers which converges
to it. But $x_A$ is not computable if A is not recursive. We call real numbers which
are the limits of increasing (decreasing) computable sequences of rational numbers
left (right) computable and denote by $\Sigma_1$ ($\Pi_1$) the class of all left
(right) computable real numbers. Notice that, for an increasing (decreasing) sequence
$(r_n)_{n \in \mathbb{N}}$ of rational numbers,
$\lim_{n \to \infty} r_n = \sup_n r_n$ ($\lim_{n \to \infty} r_n = \inf_n r_n$).
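
How an enumeration of a set A yields an increasing sequence of rationals converging to the sum of 2^{-i} over i in A can be seen in a small sketch. It works on a finite prefix of an enumeration (given as a Python iterable, possibly with repetitions) and uses exact rationals; the name `left_approximations` is hypothetical.

```python
from fractions import Fraction

def left_approximations(enumeration):
    """Yield the non-decreasing partial sums of 2^-i along an enumeration of A."""
    seen = set()
    total = Fraction(0)
    for i in enumeration:
        if i not in seen:          # each element contributes only once
            seen.add(i)
            total += Fraction(1, 2 ** i)
        yield total

approximations = list(left_approximations([3, 1, 3, 2]))
```

The sequence never decreases, but nothing in the enumeration tells us how close the current value is to the limit; this is exactly the missing effectivity of convergence.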

By applying Shoenfield’s Limit Lemma [11] we show that $x = x_A$ for some
$\Delta^0_2$-set A iff x is the limit of a computable sequence of rational numbers.
The class of all such real numbers is denoted by $\Delta_2$. In this case, neither
monotonicity of the sequence nor effectivity of convergence can be guaranteed. As is
shown in [16], the class $\Delta_2$ extends the class $\Sigma_1$ properly. It is well
known in analysis that a sequence converges if its limit superior and limit inferior
are equal. As a kind of effectivization of this result, we show that
$x \in \Delta_2$ if there are two computable sequences $(x_n)_{n \in \mathbb{N}}$ and
$(y_n)_{n \in \mathbb{N}}$ of rational numbers such that

\[ x = \liminf_{n \to \infty} x_n = \limsup_{n \to \infty} y_n . \]

Although every sequence of rational numbers has a limit superior and a limit
inferior, it is not always convergent. It is then worth asking whether the limit
superior (or inferior) of a computable sequence of rational numbers is always in
$\Delta_2$. The answer is no. Let $\Sigma_2$ ($\Pi_2$) be the set of all
$\liminf_{n \to \infty} x_n$ ($\limsup_{n \to \infty} x_n$) for some computable sequence
$(x_n)_{n \in \mathbb{N}}$ of rational numbers. Then $\Sigma_2$ and $\Pi_2$ extend the
class $\Delta_2$ properly. The key step to show this property is another description
of the class $\Sigma_2$ ($\Pi_2$): $x \in \Sigma_2$ ($\Pi_2$) iff there is a computable
double sequence $(x_{ij})_{i,j \in \mathbb{N}}$ of rational numbers such that
$x = \sup_i \inf_j x_{ij}$ ($x = \inf_i \sup_j x_{ij}$).
We will see that the radius of convergence of a power series $\sum_n a_n x^n$ for
some computable sequence $(a_n)_{n \in \mathbb{N}}$ of real numbers and the left
endpoint of a computable $G_\delta$-interval of R are natural examples of real numbers
in $\Sigma_2$.

By applying the operations “sup” and “inf” alternately n times to computable
(multiple) sequences of rational numbers we can introduce the classes
$\Sigma_n$, $\Pi_n$ and $\Delta_n$, which form a non-collapsing hierarchy
$\{\Sigma_n, \Pi_n, \Delta_n : n \in \mathbb{N}\}$ of (arithmetical) real numbers. This
hierarchy corresponds to the arithmetical hierarchy of all arithmetical subsets of N.

The outline of this paper is as follows. Section 2 is devoted to preliminaries.
Sections 3 and 4 give some technical lemmas about the relationships
among the operations “lim”, “liminf”, “limsup”, “sup inf” and “inf sup” on
(relatively) computable sequences of rational numbers. In Section 5 we discuss
the properties of approximable real numbers, i.e., the real numbers in $\Delta_2$.
Some interesting examples of real numbers in $\Sigma_2$ and $\Pi_2$ are given in
Section 6. Finally, we introduce the general arithmetical hierarchy of (arithmetical)
real numbers in Section 7. The hierarchy theorem and some other properties of the
hierarchy are proved in that section.



2 Preliminaries 

In this section we summarize some notations we use in this paper. Most of them
are standard from recursion theory [12, 9] or from effective analysis [10, 14, 15].
Let $A \subseteq \mathbb{N}$ be a set of natural numbers, $(M_e^A)_{e \in \mathbb{N}}$ an
“effective” enumeration of all oracle Turing machines with oracle A and
$(\varphi_e^A)_{e \in \mathbb{N}}$ the corresponding “effective” enumeration of all
(partial) A-computable functions from N to N such that $\varphi_e^A$ is computed by
the e-th Turing machine $M_e^A$. By $\varphi_{e,s}^A$ we denote the s-th approximation
of $\varphi_e^A$, which is defined by $\varphi_{e,s}^A(x) := y$ if $e, x, y < s$
and the machine $M_e^A$ on input x computes y in at most s steps, and
$\varphi_{e,s}^A(x)$ is undefined otherwise. The set of all total A-computable
functions from N to N is denoted by $F^A$. The jump $A'$ of A is defined by
$A' := \{n \in \mathbb{N} : \varphi_n^A(n)\downarrow\}$. The n-th jump $A^{(n)}$ is
defined inductively by $A^{(n+1)} = (A^{(n)})'$. Occasionally, we
identify a set $B \subseteq \mathbb{N}$ with its characteristic function. Then
$B \subseteq \mathbb{N}$ is Turing reducible to A (denoted by $B \le_T A$) iff there is
an $e \in \mathbb{N}$ such that $B = \varphi_e^A$.
Let $\langle \cdot, \cdot \rangle : \mathbb{N}^2 \to \mathbb{N}$ be the Cantor pairing
function defined by $\langle i, j \rangle := \frac{1}{2}(i + j)(i + j + 1) + j$ and let
$\pi_1, \pi_2 : \mathbb{N} \to \mathbb{N}$ be the first and second projections of
its inverse. Notice that the pairing function is strictly monotonic in both
arguments. For any $n \in \mathbb{N}$, let $D_n := \{m 2^{-n} : m \in \mathbb{N}\}$ and
$D := \bigcup_{n \in \mathbb{N}} D_n$. Then D is the set of all finite dyadic rational
numbers. For the set Q of rational numbers, we define an effective numbering
$\nu_Q : \mathbb{N} \to Q$ by $\nu_Q(\langle i, j, k \rangle) := (i - j)/(k + 1)$.
A function $f : \mathbb{N}^n \to Q$ is called A-computable if there is an A-computable
function $g : \mathbb{N}^n \to \mathbb{N}$ such that
$f(i_1, \ldots, i_n) = \nu_Q \circ g(i_1, \ldots, i_n)$. The set of all total
A-computable functions from $\mathbb{N}^n$ (for some $n \in \mathbb{N}$) to Q and to D
is denoted by $F_Q^A$ and $F_D^A$, respectively.
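
The pairing function, its projections and the numbering of the rationals can be written out as a short sketch. Two points are assumptions rather than part of the text: the triple code is read as a pair nested on the right, and the function names `pair`, `unpair` and `nu_Q` are chosen only for this illustration.

```python
from fractions import Fraction
from math import isqrt

def pair(i, j):
    """Cantor pairing <i, j> = (i + j)(i + j + 1)/2 + j."""
    return (i + j) * (i + j + 1) // 2 + j

def unpair(n):
    """Inverse of pair: return (pi_1(n), pi_2(n))."""
    w = (isqrt(8 * n + 1) - 1) // 2   # w = i + j, index of the diagonal
    j = n - w * (w + 1) // 2
    return w - j, j

def nu_Q(n):
    """Numbering of the rationals: the code of (i, j, k) is sent to (i - j)/(k + 1)."""
    # assumption: <i, j, k> is read as <i, <j, k>>
    i, rest = unpair(n)
    j, k = unpair(rest)
    return Fraction(i - j, k + 1)
```

Every rational has (many) codes under this numbering, since i - j ranges over all integers and k + 1 over all positive denominators.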

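By definition, a sequence $(r_n)_{n \in \mathbb{N}}$ on X is a function
$r : \mathbb{N} \to X$ and a double sequence $(r_{ij})_{i,j \in \mathbb{N}}$ is a function
$r : \mathbb{N} \times \mathbb{N} \to X$. A sequence $(x_n)_{n \in \mathbb{N}}$
of real numbers is A-computable if there is an A-computable double sequence
$(r_{nm})_{n,m \in \mathbb{N}}$ of rational numbers which converges uniformly and
effectively to $(x_n)_{n \in \mathbb{N}}$, i.e., there is a recursive function
$e : \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ such that

\[ (\forall n, N, k \in \mathbb{N})\ (k > e(n, N) \Rightarrow |r_{nk} - x_n| < 2^{-N}). \]

We will say that a sequence $(r_n)_{n \in \mathbb{N}}$ of real numbers converges to x
effectively if $(\forall n \in \mathbb{N})(|x - r_n| < 2^{-n})$.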

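We recall the definition of the arithmetical hierarchy of subsets of N.
$\Sigma^0_0 = \Pi^0_0 = \Delta^0_0$ is the set of all recursive subsets of N. For
$n \ge 1$ and any set $A \subseteq \mathbb{N}$, $A \in \Sigma^0_n$ iff there is a
recursive set $R \subseteq \mathbb{N}$ such that, for any $i \in \mathbb{N}$,

\[ i \in A \iff (\exists m_1)(\forall m_2)(\exists m_3) \cdots (Q m_n)\ (\langle i, m_1, \ldots, m_n \rangle \in R), \]

where Q is “∃” if n is odd, and “∀” otherwise. $A \in \Pi^0_n$ if the complement
$\bar{A}$ of A is in $\Sigma^0_n$. $\Delta^0_n$ is defined as
$\Sigma^0_n \cap \Pi^0_n$. By Kleene’s Hierarchy Theorem [7], we
have $\Delta^0_n \subsetneq \Sigma^0_n$ and $\Delta^0_n \subsetneq \Pi^0_n$ for all
$n > 0$. A set $A \subseteq \mathbb{N}$ is called $\Sigma^0_n$-complete
($\Pi^0_n$-complete) if $A \in \Sigma^0_n$ ($\Pi^0_n$) and $B \le_T A$ for every
$B \in \Sigma^0_n$ ($\Pi^0_n$). The set $\emptyset^{(n)}$, i.e., the n-th jump of the
empty set $\emptyset$, is a $\Sigma^0_n$-complete set. In particular,
$\emptyset^{(n)} \notin \Delta^0_n$ for all $n > 0$.

For any set $A \subseteq \mathbb{N}$, we define its binary real number by
$x_A := \sum_{i \in A} 2^{-i}$.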

A real function $f : [a, b] \to \mathbb{R}$ is called computable if

1. f is sequentially computable, i.e., f maps every computable sequence
$(x_n)_{n \in \mathbb{N}}$ of real numbers to a computable sequence
$(f(x_n))_{n \in \mathbb{N}}$;

2. f is effectively uniformly continuous, i.e., there is a recursive function
$d : \mathbb{N} \to \mathbb{N}$ such that

\[ (\forall x, y \in [a, b])(\forall n \in \mathbb{N})\ (|x - y| < 2^{-d(n)} \Rightarrow |f(x) - f(y)| < 2^{-n}). \]
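
As a small worked instance of condition 2, consider f(x) = x^2 on [0, 1]. Since |x^2 - y^2| = |x - y| |x + y| and |x + y| is at most 2 there, the modulus d(n) = n + 1 works. The sketch below checks this on a finite grid of exact rationals; it is a sanity check on sample points under these assumptions, not a proof, and the choice of f, d and the helper `modulus_holds_on_grid` is made only for illustration.

```python
from fractions import Fraction

def f(x):
    return x * x

def d(n):
    # |x - y| < 2^-(n+1) and |x + y| <= 2 give |x^2 - y^2| < 2^-n on [0, 1]
    return n + 1

def modulus_holds_on_grid(n, steps):
    """Check the continuity condition for precision n on a grid in [0, 1]."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    bound_in = Fraction(1, 2 ** d(n))
    bound_out = Fraction(1, 2 ** n)
    return all(
        abs(f(x) - f(y)) < bound_out
        for x in grid for y in grid
        if abs(x - y) < bound_in
    )
```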

In particular, it follows from 1. that a computable real function maps all
computable real numbers to computable real numbers.



3 Limit Superior and Limit Inferior

In this section we list three technical lemmas about the relationships among the
“lim”, “limsup”, “liminf”, “sup inf” and “inf sup” of A-computable sequences
of rational numbers.

Lemma 1. Let $f : \mathbb{N}^3 \to Q$ be an A-computable function. If
$\sup_i \inf_j f(m, i, j)$ exists, then there is an A-computable function
$h : \mathbb{N}^3 \to Q$ such that, for any $m \in \mathbb{N}$, the following conditions
hold:

\[ \forall i, j \in \mathbb{N}\ (h(m, i, j) \ge h(m, i, j + 1)) \tag{1} \]

\[ \forall i \in \mathbb{N}\ (\{ j \in \mathbb{N} : h(m, i, j) \neq h(m, i, j + 1) \} \text{ is finite}) \tag{2} \]

\[ \inf_j h(m, i, j) \le \inf_j h(m, i + 1, j) \tag{3} \]

\[ \sup_i \inf_j h(m, i, j) = \sup_i \inf_j f(m, i, j). \tag{4} \]

The corresponding result for “inf sup” instead of “sup inf” holds accordingly.

Lemma 2. For every sequence $(x_m)_{m \in \mathbb{N}}$ of real numbers, the following
properties are equivalent:

1. There is some A-computable function f such that

\[ x_m = \sup_i \inf_j f(m, i, j) \quad \text{for all } m \in \mathbb{N}, \]

2. There is some A-computable function g such that

\[ x_m = \liminf_{i \to \infty} g(m, i) \quad \text{for all } m \in \mathbb{N}. \]

The result holds accordingly for $\inf_i \sup_j$ and $\limsup_{i \to \infty}$ instead of
$\sup_i \inf_j$ and $\liminf_{i \to \infty}$, respectively.

Lemma 3. For every sequence $(x_m)_{m \in \mathbb{N}}$ of real numbers, the following
properties are equivalent:

1. There is some A-computable function f such that

\[ x_m = \lim_{i \to \infty} f(m, i) \quad \text{for all } m \in \mathbb{N}, \]

2. There are A-computable functions $g_<$ and $g_>$ such that

\[ x_m = \liminf_{i \to \infty} g_<(m, i) = \limsup_{i \to \infty} g_>(m, i) \quad \text{for all } m \in \mathbb{N}, \]

3. There are A-computable functions $h_<$ and $h_>$ such that

\[ x_m = \sup_i \inf_j h_<(m, i, j) = \inf_i \sup_j h_>(m, i, j) \quad \text{for all } m \in \mathbb{N}. \]



4 Limit Lemmas

In recursion theory, Shoenfield’s Limit Lemma [11] says that a function
$f : \mathbb{N} \to \mathbb{N}$ is $A'$-computable iff there is an A-computable function
$g : \mathbb{N}^2 \to \mathbb{N}$ such that $f(n) = \lim_{m \to \infty} g(m, n)$. This
obviously holds also for functions from N to Q in the following sense.

Lemma 4 (Shoenfield’s Limit Lemma [11]). A function $f : \mathbb{N} \to Q$
is $A'$-computable iff there is an A-computable function $g : \mathbb{N}^2 \to Q$ such
that $f(n) = \lim_{m \to \infty} g(m, n)$ and $g(m, n) = g(m + 1, n)$ for almost all m.

In this section, we show some other kinds of “limit lemmas” which relate the
$A'$-computable functions with A-computable functions from natural numbers to
rational numbers and hold for “sup” and “inf” as well.

Lemma 5. Let $A \subseteq \mathbb{N}$ and $f : \mathbb{N}^3 \to \mathbb{Q}$ be an $A$-computable function. Then there are $A'$-computable functions $g_1, g_2, g_3 : \mathbb{N}^2 \to \mathbb{Q}$ such that

$\sup_i \inf_j f(m,i,j) = \sup_i g_1(m,i)$;  (5)

$\inf_i \sup_j f(m,i,j) = \inf_i g_2(m,i)$;  (6)

$\lim_{i\to\infty} \lim_{j\to\infty} f(m,i,j) = \lim_{i\to\infty} g_3(m,i)$,  (7)

if the left parts of the equations exist respectively.

Lemma 6. Let $A \subseteq \mathbb{N}$ and $f : \mathbb{N}^2 \to \mathbb{Q}$ be an $A'$-computable function. Then there are $A$-computable functions $g_1, g_2, g_3 : \mathbb{N}^3 \to \mathbb{Q}$ such that

$\sup_i f(m,i) = \sup_i \inf_j g_1(m,i,j)$;  (8)

$\inf_i f(m,i) = \inf_i \sup_j g_2(m,i,j)$;  (9)

$\lim_{i\to\infty} f(m,i) = \lim_{i\to\infty} \lim_{j\to\infty} g_3(m,i,j)$,  (10)

if the left parts of the equations exist respectively.

5 Recursively Approximable Real Numbers 

A real number $x$ is called left computable (right computable) if there is a computable increasing (decreasing) sequence $(r_n)_{n\in\mathbb{N}}$ of rational numbers which converges to $x$. Left and right computable real numbers are called semi-computable. The next proposition follows immediately from the definition.

Proposition 1. A real number $x \in \mathbb{R}$ is left (right) computable iff there is a computable function $f : \mathbb{N} \to \mathbb{Q}$ such that $x = \sup_i f(i)$ ($x = \inf_i f(i)$).

We denote by $\Sigma_1$ ($\Pi_1$) the set of all left (right) computable real numbers. Then $\Delta_1 := \Sigma_1 \cap \Pi_1$ is the set of all computable real numbers. For example, if $(a, b)$ is an r.e. open set, i.e., it is a union of a computable sequence of rational open intervals, then $a$ is right computable and $b$ is left computable.

A real number $x$ is called recursively approximable (r.a. for short) if there is a computable sequence $(r_n)_{n\in\mathbb{N}}$ of rational numbers which converges to $x$. Then, every semi-computable real number is also recursively approximable. The converse is not true, as shown in [16]. In this section we will discuss some basic properties of r.a. real numbers. The next proposition shows that the class of all r.a. real numbers is closed under the operations of "lim", computable mapping and arithmetical operations.

Proposition 2. 1. If $(x_n)_{n\in\mathbb{N}}$ is a computable sequence of real numbers and $x = \lim_{n\to\infty} x_n$, then $x$ is r.a.;

2. If $f : \mathbb{R} \to \mathbb{R}$ is a computable real function and $x \in \mathrm{dom}(f)$ an r.a. real number, then $f(x)$ is also r.a.;

3. If $x, y \in \mathbb{R}$ are r.a., then $x + y$, $x - y$, $x \times y$ and $x \div y$ ($y \ne 0$) are also r.a. That is, the class of all r.a. real numbers is closed under the arithmetical operations $+$, $-$, $\times$ and $\div$, hence is a closed field.

Ho [5] calls a real number $x \in \mathbb{R}$ $A$-computable if there is an $A$-computable function $f : \mathbb{N} \to \mathbb{Q}$ such that $|f(n) - x| < 2^{-n}$ for all $n \in \mathbb{N}$. Then he shows that $x \in \mathbb{R}$ is r.a. iff $x$ is $\emptyset'$-computable. We will give two further characterizations of r.a. real numbers in this section. The next theorem shows that a real number is r.a. iff it can be expressed both as the limit superior and limit inferior of some computable sequences of rational numbers. This result is very similar to the well-known result that a real number is computable iff it is both left and right computable. The proof follows immediately from Lemma 3 and Lemma 2.

Theorem 1. A real number $x$ is r.a. iff there are computable sequences $(x_i)_{i\in\mathbb{N}}$ and $(y_i)_{i\in\mathbb{N}}$ of rational numbers such that $\liminf_{i\to\infty} x_i = \limsup_{i\to\infty} y_i = x$, hence iff there are computable functions $f, g : \mathbb{N}^2 \to \mathbb{Q}$ such that $x = \inf_i \sup_j f(i,j) = \sup_i \inf_j g(i,j)$.

The second characterization considers the binary expansion of the real number.

Lemma 7. Let $A \subseteq \mathbb{N}$ and $f : \mathbb{N} \to D \cap [0,2]$ be an $A$-computable function such that $\lim_{n\to\infty} f(n)$ exists. Then there is a set $B \subseteq \mathbb{N}$ such that $B \le_T A'$ and $\lim_{n\to\infty} f(n) = x[B]$.

Theorem 2. A real number $x \in [0,2]$ is r.a. iff there is a set $A \in \Delta^0_2$ such that $x = x[A]$.

Because of Theorem 2 we will call an r.a. real number $\Delta_2$-computable, and the set of all recursively approximable real numbers is denoted by $\Delta_2$.

6 $\Sigma_2$- and $\Pi_2$-Real Numbers

In Section 5 we have seen that any r.a. real number is the limit superior (and limit inferior) of a computable sequence of rational numbers. We will show in this section that the converse is not true.

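Theorem 3. There is a computable sequence $(x_{ij})_{i,j\in\mathbb{N}}$ of rational numbers such that $a := \inf_i \sup_j x_{ij}$ is not r.a. Hence, there is a computable sequence $(x_n)_{n\in\mathbb{N}}$ of rational numbers such that its limit superior $\limsup_{n\to\infty} x_n$ is not r.a.

Proof. Let $A := \{i \in \mathbb{N} : \varphi_i \text{ is a total function}\}$. Then $A$ is a $\Pi^0_2$-complete set, hence it is not a $\Delta^0_2$-set. It follows from Theorem 2 that the real number $x[A]$ is not r.a.

Define a sequence $(x_{ij})_{i,j\in\mathbb{N}}$ of rational numbers by

$x_{ij} := \sum\{2^{-e} : e \in \mathbb{N} \;\&\; \forall y < i\,(\varphi_{e,j}(y)\downarrow)\}$  (11)

for all $i, j \in \mathbb{N}$. Obviously, $(x_{ij})_{i,j\in\mathbb{N}}$ is a computable sequence which satisfies

$\inf_i \sup_j x_{ij} = \inf_i \sup_j \sum\{2^{-e} : e \in \mathbb{N} \;\&\; \forall y < i\,(\varphi_{e,j}(y)\downarrow)\}$

$= \inf_i \sum\{2^{-e} : e \in \mathbb{N} \;\&\; \exists j\,\forall y < i\,(\varphi_{e,j}(y)\downarrow)\}$

$= \sum\{2^{-e} : e \in \mathbb{N} \;\&\; \forall i\,\exists j\,\forall y < i\,(\varphi_{e,j}(y)\downarrow)\}$

$= \sum\{2^{-e} : e \in \mathbb{N} \;\&\; \varphi_e \text{ is a total function}\} = x[A]$.

Therefore $\inf_i \sup_j x_{ij}$ is not r.a.

The second assertion of the theorem follows immediately from Lemma 2.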

The result of Theorem 3 holds accordingly for limit inferior and "sup inf". We define $\Sigma_2$ ($\Pi_2$) to be the class of real numbers $x \in \mathbb{R}$ such that there is a computable function $f : \mathbb{N}^2 \to \mathbb{Q}$ with $x = \sup_i \inf_j f(i,j)$ ($x = \inf_i \sup_j f(i,j)$).

They are the natural extensions of $\Delta_2$ and the extensions are proper by Theorem 3, i.e., $\Delta_2 \subsetneq \Sigma_2$ and $\Delta_2 \subsetneq \Pi_2$. Furthermore, it follows from Lemma 3 that $\Delta_2 = \Sigma_2 \cap \Pi_2$. In the following, we give some examples of real numbers in $\Sigma_2$ and $\Pi_2$.

Let $f :\subseteq \mathbb{R} \to \mathbb{R}$ be a real function defined by a computable power series, i.e., $f(x) := \sum_{n=0}^{\infty} a_n x^n$ and $(a_n)_{n\in\mathbb{N}}$ is a computable sequence of real numbers. Then, it is shown by Mazur [8] that $f$ is computable on any closed interval $[-r+\delta, r-\delta]$, where $r$ is the radius of convergence of the power series and $\delta > 0$. So it suffices to determine the radius of convergence of this power series if we want to know the domain on which $f$ is computable. From analysis it is well known that the power series $\sum_{n=0}^{\infty} a_n x^n$ has the radius of convergence

$r := 1/\limsup_{n\to\infty} \sqrt[n]{|a_n|}$.

We call a power series $\sum_{n=0}^{\infty} a_n x^n$ computable if its coefficients form a computable sequence $(a_n)_{n\in\mathbb{N}}$ of real numbers. Then the following theorem follows immediately from Theorem 3, Lemma 2 and the fact that $x$ is r.a. iff $1/x$ is r.a.

Theorem 4. 1. If $\sum_{n=0}^{\infty} a_n x^n$ is a computable power series with radius of convergence $r < \infty$, then $r \in \Sigma_2$;

2. There is a computable power series $\sum_{n=0}^{\infty} a_n x^n$ such that its radius of convergence is not r.a.

Next, we give some examples of $\Sigma_2$ and $\Pi_2$ real numbers which are related to some special points of computable $G_\delta$-subsets of $\mathbb{R}$.

Definition 1. A set $A \subseteq \mathbb{R}$ is a computable $G_\delta$-set if there is a computable sequence $(I_{ij})_{i,j\in\mathbb{N}}$ of rational open intervals such that $A = \bigcap_{i\in\mathbb{N}} \bigcup_{j\in\mathbb{N}} I_{ij}$.

Definition 2. Let $A \subseteq \mathbb{R}$ be any set of real numbers. A real number $a \in \mathbb{R}$ is called a left cut-point (or right cut-point) of $A$ if there are real numbers $c_1, c_2 \in \mathbb{R}$ such that $c_1 < a < c_2$ and

$(c_1, a) \cap A = \emptyset \;\&\; (a, c_2) \subseteq A$  (12)

(or $(c_1, a) \subseteq A \;\&\; (a, c_2) \cap A = \emptyset$).  (13)

For example, the left and right endpoints of an (open or closed) interval are left and right cut-points, respectively.

Theorem 5. If $A \subseteq \mathbb{R}$ is a computable $G_\delta$-set and $a \in \mathbb{R}$ a left (right) cut-point of $A$, then $a \in \Sigma_2$ ($\Pi_2$). Furthermore, $a \in \Pi_1$ ($\Sigma_1$), if $a \notin A$.

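Proof. Let $A \subseteq \mathbb{R}$ be a computable $G_\delta$-set, i.e., there is a computable sequence $(I_{ij})_{i,j\in\mathbb{N}}$ of rational open intervals such that $A = \bigcap_{i\in\mathbb{N}} \bigcup_{j\in\mathbb{N}} I_{ij}$. Let $U_i := \bigcup_{j\in\mathbb{N}} I_{ij}$ for $i \in \mathbb{N}$. Let $a \in \mathbb{R}$ be a left cut-point of $A$, i.e., there are two real numbers $c_1, c_2$ with $c_1 < a < c_2$ which satisfy (12). Assume w.l.o.g. that $c_1, c_2$ are both rational numbers.

Let $J_{ij} := I_{ij} \cap (c_1, c_2)$ for $i, j \in \mathbb{N}$. Then $A_1 := A \cap (c_1, c_2) = \bigcap_{i\in\mathbb{N}} \bigcup_{j\in\mathbb{N}} J_{ij}$ is also a computable $G_\delta$-set. Fix a rational number $b \in (a, c_2)$ and define $V_{ij}$ to be the longest rational open interval $I$ which satisfies $b \in I$ and $I \subseteq \bigcup_{t \le j} J_{it}$. Suppose that $V_{ij} = (u_{ij}, v_{ij})$. Then $(u_{ij})_{i,j\in\mathbb{N}}$ and $(v_{ij})_{i,j\in\mathbb{N}}$ are computable sequences of rational numbers which are non-increasing and non-decreasing, respectively. So, $u_i := \inf_{j\in\mathbb{N}} u_{ij}$ and $v_i := \sup_{j\in\mathbb{N}} v_{ij}$ are right and left computable, respectively. Since $(a, c_2) \subseteq \bigcup_{j\in\mathbb{N}} J_{ij}$, we have $u_i \le a$ and $c_2 \le v_i$ for all $i \in \mathbb{N}$. It follows that $\sup_{i\in\mathbb{N}} \inf_{j\in\mathbb{N}} u_{ij} = \sup_{i\in\mathbb{N}} u_i \le a$.

Now we show that $\sup_{i\in\mathbb{N}} \inf_{j\in\mathbb{N}} u_{ij} \ge a$. Assume by contradiction that there is an $r \in \mathbb{R}$ such that $\sup_{i\in\mathbb{N}} \inf_{j\in\mathbb{N}} u_{ij} < r < a$. This implies that $r \in \bigcup_{j\in\mathbb{N}} J_{ij} \subseteq U_i$ for all $i \in \mathbb{N}$, hence $r \in \bigcap_{i\in\mathbb{N}} U_i = A$. This contradicts the hypothesis that $a$ is a left cut-point of $A$.

Therefore $\sup_{i\in\mathbb{N}} \inf_{j\in\mathbb{N}} u_{ij} = a$ and hence $a$ is $\Sigma_2$-computable.

Now suppose that $a \notin A$, hence $a \notin A_1$. Notice that $(a, c_2) \subseteq \bigcup_{j\in\mathbb{N}} V_{ij} = (u_i, v_i)$. If $u_i < a$ for all $i \in \mathbb{N}$, then $a \in \bigcap_{i\in\mathbb{N}} (u_i, v_i) = \bigcap_{i\in\mathbb{N}} \bigcup_{j\in\mathbb{N}} V_{ij} \subseteq \bigcap_{i\in\mathbb{N}} U_i = A$, a contradiction. So there must be some $i$ such that $a = u_i$. This means that $a$ is a right computable real number, i.e., $a \in \Pi_1$.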

Corollary 1. Let $A \subseteq \mathbb{R}$ be a computable $G_\delta$-set. Then the following holds.

1. $A = [a, b] \Longrightarrow a \in \Sigma_2 \;\&\; b \in \Pi_2$;

2. $A = [a, b) \Longrightarrow a \in \Sigma_2 \;\&\; b \in \Sigma_1$;

3. $A = (a, b] \Longrightarrow a \in \Pi_1 \;\&\; b \in \Pi_2$;

4. $A = (a, b) \Longrightarrow a \in \Pi_1 \;\&\; b \in \Sigma_1$.



7 Arithmetic Hierarchy of Real Numbers 



In preceding sections we have discussed the classes $\Delta_1$, $\Sigma_1$, $\Pi_1$, $\Delta_2$, $\Sigma_2$ and $\Pi_2$ of real numbers which are characterized by different "limit properties" of computable sequences of rational numbers, respectively, as follows:

$x \in \Sigma_1$ ($\Pi_1$) $\iff x = \sup_n f(n)$ ($x = \inf_n f(n)$)

$x \in \Sigma_2$ ($\Pi_2$) $\iff x = \sup_n \inf_m g(n,m)$ ($x = \inf_n \sup_m g(n,m)$)

$x \in \Delta_1 \iff x = \lim_{n\to\infty} f(n)$ effectively

$x \in \Delta_2 \iff x = \lim_{n\to\infty} f(n)$,

where $f : \mathbb{N} \to \mathbb{Q}$ and $g : \mathbb{N}^2 \to \mathbb{Q}$ are computable functions. These classes satisfy the following properties: $\Delta_1 \subsetneq \Sigma_1 \cup \Pi_1 \subsetneq \Delta_2 \subsetneq \Sigma_2 \cup \Pi_2$. We extend this procedure to introduce an infinite hierarchy of (subsets of) the real numbers. In the following $\Theta_{i_n}$ will denote "$\sup_{i_n}$" if $n$ is odd, and "$\inf_{i_n}$" if $n$ is even, and $\bar{\Theta}_{i_n}$ will denote "$\inf_{i_n}$" if $n$ is odd, and "$\sup_{i_n}$" if $n$ is even. Although the definition seems quite natural, the authors have not found it in the literature explicitly.

Definition 3. 1. $\Sigma_0 = \Pi_0 = \Delta_0 := \{x \in \mathbb{R} : x \text{ is computable}\}$;

2. For $n > 0$,

$\Sigma_n := \{x \in \mathbb{R} : (\exists f \in \Gamma_{\mathbb{Q}})(x = \sup_{i_1} \inf_{i_2} \sup_{i_3} \cdots \Theta_{i_n} f(i_1, \ldots, i_n))\}$;

$\Pi_n := \{x \in \mathbb{R} : (\exists f \in \Gamma_{\mathbb{Q}})(x = \inf_{i_1} \sup_{i_2} \inf_{i_3} \cdots \bar{\Theta}_{i_n} f(i_1, \ldots, i_n))\}$;

3. $\Delta_n := \Sigma_n \cap \Pi_n$.

If $x \in \Sigma_n$ ($\Pi_n$, $\Delta_n$), then we also say that $x$ is $\Sigma_n$ ($\Pi_n$, $\Delta_n$)-computable.

Obviously, $x \in \Sigma_n$ iff $-x \in \Pi_n$ for any real number $x$. The next lemma gives another characterization of these classes, which generalizes Proposition 1, (1) of Proposition 2 and Theorem 1.

Lemma 8. For any $n \ge 1$ and any real number $x \in \mathbb{R}$,

$x \in \Sigma_{n+1} \iff (\exists f \in \Gamma_{\mathbb{Q}}^{\emptyset^{(n)}})\,(x = \sup_{i\in\mathbb{N}} f(i))$  (14)

$x \in \Pi_{n+1} \iff (\exists f \in \Gamma_{\mathbb{Q}}^{\emptyset^{(n)}})\,(x = \inf_{i\in\mathbb{N}} f(i))$  (15)

$x \in \Delta_{n+1} \iff (\exists f \in \Gamma_{\mathbb{Q}}^{\emptyset^{(n)}})\,(x = \lim_{i\to\infty} f(i) \text{ effectively})$  (16)

$x \in \Delta_{n+2} \iff (\exists f \in \Gamma_{\mathbb{Q}}^{\emptyset^{(n)}})\,(x = \lim_{i\to\infty} f(i))$  (17)

Remark: It is not difficult to see from (16) and (17) that the classes $\Delta_{n+1}$ and $\Delta_{n+2}$ are the classes of [n]-constructive real numbers and of [n]-constructive pseudonumbers, respectively, introduced by O. Demuth (cf. [2]). Notice that a real number $x \in \mathbb{R}$ is called arithmetical (cf. [6]) if there is an $A$-computable function $f : \mathbb{N} \to \mathbb{Q}$, for some arithmetical set $A \subseteq \mathbb{N}$, such that $\lim_{n\to\infty} f(n) = x$ effectively. Then $\bigcup_{n\in\mathbb{N}} \Delta_n$ is the set of all arithmetical real numbers.

Corollary 2. For any $x \in \mathbb{R}$ and $n \ge 1$, $x \in \Delta_{n+1}$ iff there is a computable function $f : \mathbb{N}^n \to \mathbb{Q}$ such that

$x = \lim_{i_1\to\infty} \cdots \lim_{i_n\to\infty} f(i_1, \ldots, i_n)$.  (18)

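Definition 4. A sequence $(x_s)_{s\in\mathbb{N}}$ of real numbers is called $\Sigma_n$-computable, $\Pi_n$-computable and $\Delta_{n+1}$-computable, if there are computable functions $f, g, h : \mathbb{N}^{n+1} \to \mathbb{Q}$, respectively, such that, for all $s \in \mathbb{N}$,

$x_s = \sup_{i_1} \inf_{i_2} \cdots \Theta_{i_n} f(s, i_1, \ldots, i_n)$;

$x_s = \inf_{i_1} \sup_{i_2} \cdots \bar{\Theta}_{i_n} g(s, i_1, \ldots, i_n)$;

$x_s = \lim_{i_1\to\infty} \cdots \lim_{i_n\to\infty} h(s, i_1, \ldots, i_n)$.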

Proposition 3. For any $x \in \mathbb{R}$ and $n \ge 1$,

1. $x \in \Sigma_n \iff$ there is a $\Delta_n$-computable sequence $(x_s)_{s\in\mathbb{N}}$ of real numbers such that $x = \sup_s x_s$;

2. $x \in \Pi_n \iff$ there is a $\Delta_n$-computable sequence $(x_s)_{s\in\mathbb{N}}$ of real numbers such that $x = \inf_s x_s$;

3. $x \in \Delta_{n+1} \iff$ there is a $\Delta_n$-computable sequence $(x_s)_{s\in\mathbb{N}}$ of real numbers such that $x = \lim_{s\to\infty} x_s$.

Proposition 4. For any $n \in \mathbb{N}$,

1. $\Delta_n$ is closed under the arithmetical operations of addition, subtraction, multiplication and division. That is, $\Delta_n$ is an algebraic field.

2. If $f : \mathbb{R} \to \mathbb{R}$ is a computable function and $x \in \Delta_n$, then $f(x) \in \Delta_n$.

Obviously, Proposition 4 does not hold for the classes $\Sigma_n$ and $\Pi_n$ instead of $\Delta_n$. For example, if $x \in \Sigma_n \setminus \Delta_n$, then $-x \notin \Sigma_n$, although the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) := -x$ is computable.

Lemma 9. For any set $A \subseteq \mathbb{N}$ and any $n \in \mathbb{N}$, if $A \in \Sigma^0_n$ ($\Pi^0_n$) then $x[A] \in \Sigma_n$ ($\Pi_n$).

Notice that the converse of Lemma 9 is not true, because Jockusch has observed that there is a non-r.e. set $A \subseteq \mathbb{N}$ such that $x[A]$ is still $\Sigma_1$-computable.

Theorem 6. For any set $A \subseteq \mathbb{N}$, $A \in \Delta^0_n$ iff $x[A] \in \Delta_n$.

Proof. We need only consider the case of $n \ge 2$. If $A \in \Delta^0_n$, then $x[A] \in \Delta_n$ by Lemma 9. If $x[A] \in \Delta_n$, by Lemma 8, there is a $\emptyset^{(n-2)}$-computable function $f : \mathbb{N} \to \mathbb{Q}$ such that $x[A] = \lim_{n\to\infty} f(n)$. Assume w.l.o.g. that $f(n) \in D \cap [0,2]$. Then, by Lemma 7, there is a set $A \le_T \emptyset^{(n-1)}$, hence $A \in \Delta^0_n$ by Post's Theorem, such that $x = x[A]$.

From Theorem 6 and Lemma 9, the hierarchy theorem for the arithmetical hierarchy $\{\Sigma_n, \Pi_n, \Delta_n : n \in \mathbb{N}\}$ of arithmetical real numbers follows immediately.

Theorem 7 (Hierarchy Theorem). For $n \ge 1$, $\Delta_n \subsetneq \Sigma_n$ and $\Delta_n \subsetneq \Pi_n$.

Proof. Let $A := \emptyset^{(n)}$. Then $A \in \Sigma^0_n \setminus \Delta^0_n$. By Lemma 9 and Theorem 6, $x[A] \in \Sigma_n \setminus \Delta_n$. Similarly, we can show that $x[\bar{A}] \in \Pi_n \setminus \Delta_n$.

Abstract. In this paper we describe the Burrows-Wheeler Transform (BWT), a completely new approach to data compression which is the basis of some of the best compressors available today. Although it is easy to intuitively understand why the BWT helps compression, the analysis of BWT-based algorithms requires a careful study of every single algorithmic component. We describe two algorithms which use the BWT and we show that their compression ratio can be bounded in terms of the k-th order empirical entropy of the input string for any $k \ge 0$. Intuitively, this means that these algorithms are able to make use of all the regularity which is in the input string.

We also discuss some of the algorithmic issues which arise in the computation of the BWT, and we describe two variants of the BWT which promise interesting developments.



1 Introduction 

It seems that there is no limit to the amount of data we need to store in our computers, or send to our friends and colleagues. Although technology is providing us with larger disks and faster communication networks, the need for faster and more efficient data compression algorithms seems to be always increasing. Fortunately, data compression algorithms have continued to evolve, a progress which should not be taken for granted since there are well known theoretical limits to how much we can squeeze our data (see [31] for a complete review of the state of the art in all fields of data compression).

Progress in data compression usually consists of a long series of small improvements and fine tuning of algorithms. However, the field experiences occasional giant leaps when new ideas or techniques emerge. In the field of lossless compression we have just witnessed one of these leaps with the introduction of the Burrows-Wheeler Transform [7] (BWT from now on). Loosely speaking, the BWT produces a permutation bw(s) of the input string s such that from bw(s) we can retrieve s but at the same time bw(s) is much easier to compress. The whole idea of a transformation that makes a string easier to compress is completely new, even if, after the appearance of the BWT, some researchers recognized that it is related to some well known compression techniques (see for example [8, 13, 18]).

The BWT is a very powerful tool and even the simplest algorithms which use it have surprisingly good performance (the reader may look at the very simple and clean BWT-based algorithm described in [26] which outperforms, in terms of compression ratio, the commercial package pkzip). More advanced BWT-based compressors, such as bzip2 [34] and szip [33], are among the best compressors currently available. As can be seen from the results reported in [2], BWT-based compressors achieve a very good compression ratio using relatively small resources (time and space). Considering that BWT-based compressors are still in their infancy, we believe that in the near future they are likely to become the new standard in lossless data compression.

In this paper we describe the BWT and we explain why it helps compression. Then, we describe two simple BWT-based compressors and we show that their compression ratio can be bounded in terms of the empirical entropy of the input string. We briefly discuss the algorithms which are currently used for computing the BWT, and we conclude by describing two recently proposed BWT variants.



2 Description of the BWT 

The Burrows-Wheeler transform [7] consists of a reversible transformation of the input string s. The transformed string, which we denote by bw(s), is simply a permutation of the input, but it is usually much easier to compress in a sense we will make clear later. The transformed string bw(s) is obtained as follows^1 (see Fig. 1). First we add to s a unique end-of-file symbol •. Then we form a (conceptual) matrix containing all cyclic shifts of s•. Then, we sort the rows of this matrix in right-to-left lexicographic order (considering • to be the symbol with the lowest rank) and we set bw(s) to be the first column of the sorted matrix with the end-of-file symbol removed. Note that this process is equivalent to sorting the symbols of s using, as a sort key for each symbol, its context, that is, the set of symbols preceding it. The output of the Burrows-Wheeler transform is the string bw(s) and the index I in the sorted matrix of the row starting with the end-of-file symbol^2 (for example, in Fig. 1 we have I = 3).
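The construction just described can be sketched in a few lines of Python. This is only an illustration that materializes the full matrix of cyclic shifts (so it takes quadratic space); the function name `bw` mirrors the notation of the text, while the choice of the NUL character as end-of-file symbol is an assumption made here so that the marker sorts before every ordinary character.

```python
EOF = "\0"  # assumed end-of-file marker; it must not occur in the input


def bw(s):
    """Return (bw(s), I) for the right-to-left variant described in the text."""
    t = s + EOF
    # All cyclic shifts of s followed by the end-of-file symbol.
    shifts = [t[i:] + t[:i] for i in range(len(t))]
    # Right-to-left lexicographic order: compare rows by their reversal.
    shifts.sort(key=lambda row: row[::-1])
    first_column = "".join(row[0] for row in shifts)
    index = first_column.index(EOF) + 1  # 1-based row starting with EOF
    return first_column.replace(EOF, ""), index


print(bw("mississippi"))  # ('msspipissii', 3), as in Fig. 1
```

Practical implementations avoid building the matrix and sort suffixes instead; the sketch is only meant to make the definition concrete.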

^1 To better follow the example the reader should think of s as a long string with a strong structure (e.g., a Shakespeare play). Data compression is not about random strings!
^2 In the original formulation rows are sorted in left-to-right lexicographic order and there is no end-of-file symbol. We use this slightly modified definition since we find it easier to understand.

    mississippi•        m ississippi•
    ississippi•m        s sissippi•mi
    ssissippi•mi        • mississippi
    sissippi•mis        s sippi•missi
    issippi•miss        p pi•mississi
    ssippi•missi        i ssissippi•m
    sippi•missis        p i•mississip
    ippi•mississ        i •mississipp
    ppi•mississi        s issippi•mis
    pi•mississip        s ippi•missis
    i•mississipp        i ssippi•miss
    •mississippi        i ppi•mississ

Fig. 1. Example of Burrows-Wheeler transform. We have bw(mississippi) = msspipissii. The matrix on the right is obtained by sorting the rows in right-to-left lexicographic order.

Although it may seem surprising, from bw(s) and I we can always retrieve s. We show how this can be done for the example in Fig. 1. By inserting the symbol • in the I-th position of bw(s) we get the first column F of the sorted cyclic shifts matrix. Since every column of the matrix is a permutation of s•, by sorting the symbols of F we get the last column L of the sorted matrix. Let $s_i$ (resp. $F_i$, $L_i$) denote the i-th symbol of s (resp. F, L). The fundamental observation is that for $i = 2, \ldots, |s|+1$, $F_i$ is the symbol which follows $L_i$ inside s. This property enables us to retrieve the string s symbol by symbol. Since we sort the cyclic shifts matrix considering • to be the symbol with the lowest rank, it follows that $F_1$ is the first symbol of s. Thus, in our example we have $s_1$ = m. To get $s_2$ we notice that m appears in L only in position 6, thus from our previous observation we get $s_2 = F_6$ = i. Now we try to retrieve $s_3$ and we have a problem since i appears in L four times, in the positions $L_2, \ldots, L_5$. Hence, $F_2, \ldots, F_5$ are all possible candidates for $s_3$. Which is the right one? The answer follows by observing that the order in which the four i's appear in F coincides with the order in which they appear in L, since in both cases the order is determined by their context (the set of symbols immediately preceding each i). This is evident by looking at the sorted matrix in Fig. 1. The order of the four i's in L is determined by the lexicographic order of their contexts (which are m, mississipp, miss, mississ). The same contexts determine the order of the four i's in F. When we are back-transforming s we do not have the complete matrix and we do not know these contexts, but the information on the relative order still enables us to retrieve s. Continuing our example, we have that since $s_2 = F_6$ is the first i in F we must look at the first i in L, which is $L_2$. Hence, $s_3 = F_2$ = s. Since $s_3$ is the first s in F, it corresponds to $L_9$ and $s_4 = F_9$ = s. Since $F_9$ is the third s in F it corresponds to $L_{11}$ and we get $s_5 = F_{11}$ = i. The process continues until we reach the end-of-file symbol.
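The back-transformation walked through above can be sketched as follows. The code relies only on the two facts stated in the text: sorting F yields L, and the k-th occurrence of a symbol in F corresponds to its k-th occurrence in L. The name `inverse_bw` and the NUL end-of-file marker are assumptions made for this illustration.

```python
from collections import defaultdict

EOF = "\0"  # assumed end-of-file marker, lower than every ordinary symbol


def inverse_bw(b, index):
    """Recover s from bw(s) = b and the 1-based index I of the EOF row."""
    # Re-insert the end-of-file symbol to obtain the first column F.
    F = b[:index - 1] + EOF + b[index - 1:]
    # Every column is a permutation of s + EOF, so sorting F gives L.
    L = sorted(F)
    # Rows of L in which each symbol occurs, in increasing order.
    rows_in_L = defaultdict(list)
    for row, symbol in enumerate(L):
        rows_in_L[symbol].append(row)
    # match[r]: row of L holding the same occurrence as F[r]
    # (the k-th occurrence in F matches the k-th occurrence in L).
    seen = defaultdict(int)
    match = []
    for symbol in F:
        match.append(rows_in_L[symbol][seen[symbol]])
        seen[symbol] += 1
    # F[0] is the first symbol of s; F[match[r]] follows F[r] inside s.
    out = []
    row = 0
    while F[row] != EOF:
        out.append(F[row])
        row = match[row]
    return "".join(out)


print(inverse_bw("msspipissii", 3))  # mississippi
```

The walk visits rows 1, 6, 2, 9, 11, ... (1-based), exactly the sequence followed in the example above.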

Why should we care about the permuted string bw(s)? The reason is that the string bw(s) has the following remarkable property: for each substring w of s, the symbols following w in s are grouped together inside bw(s). This is a consequence of the fact that all rotations ending in w are consecutive in the sorted matrix. Since, after a (sufficiently large^3) context w only a few symbols are likely to be seen, the string bw(s) will be locally homogeneous, that is, it will consist of the concatenation of several substrings containing only a few distinct symbols.

^3 How large depends on the structure of the input. However (and this is a fundamental feature of the Burrows-Wheeler transform) our argument holds for any context w.

The Burrows-Wheeler Transform: Theory and Practice

To take advantage of this particular structure, Burrows and Wheeler suggested processing bw(s) using Move-to-Front recoding [6,27]. In Move-to-Front recoding the symbol $a_i$ is coded with an integer equal to the number of distinct symbols encountered since the previous occurrence of $a_i$. In other words, the encoder maintains a list of the symbols ordered by recency of occurrence (this will be denoted the mtf list). When the next symbol arrives, the encoder outputs its current rank and moves it to the front of the list. Therefore, a string over the alphabet $A = \{a_1, \ldots, a_h\}$ is transformed to a string over $\{0, \ldots, h-1\}$ (note that the length of the string does not change)^4.
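A minimal sketch of Move-to-Front recoding as described above. The text leaves the initial status of the mtf list open, so the sketch takes it as an explicit `alphabet` argument; that parameter and the function names are assumptions made for illustration.

```python
def mtf_encode(s, alphabet):
    """Replace each symbol by its current rank in the mtf list."""
    mtf_list = list(alphabet)  # initial list: an assumption, see the lead-in
    ranks = []
    for symbol in s:
        rank = mtf_list.index(symbol)
        ranks.append(rank)
        # Move the symbol just seen to the front of the list.
        mtf_list.insert(0, mtf_list.pop(rank))
    return ranks


def mtf_decode(ranks, alphabet):
    """Invert mtf_encode, starting from the same initial list."""
    mtf_list = list(alphabet)
    symbols = []
    for rank in ranks:
        symbol = mtf_list.pop(rank)
        symbols.append(symbol)
        mtf_list.insert(0, symbol)
    return "".join(symbols)


print(mtf_encode("msspipissii", "imps"))  # [1, 3, 0, 3, 3, 1, 1, 2, 0, 1, 0]
```

Runs of equal symbols become runs of zeros, and a symbol seen recently gets a small rank, which is the property exploited in the rest of the section.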

So far we still have a string $\hat{s}$ = mtf(bw(s)) which has exactly the same length as the input string. However, the string $\hat{s}$ will be in general highly compressible. As we have already noted, bw(s) consists of several substrings containing only a few distinct symbols. The Move-to-Front encoding of each of these substrings will generate many small integers. Therefore, although the string $\hat{s}$ = mtf(bw(s)) contains enough information to retrieve s, it consists mainly of small integers (if we start with an English text, $\hat{s}$ usually contains more than 50% 0's). The actual compression is performed in the final step of the algorithm, which exploits this "skewness" of $\hat{s}$. This is done using for example a simple zeroth order algorithm such as Huffman coding [36] or arithmetic coding [38]. These algorithms are designed to achieve a compression ratio equal to the zeroth order entropy of the input string s, which is defined by^5

$H_0(s) = -\sum_{i=1}^{h} \frac{n_i}{n} \log \frac{n_i}{n}$  (1)

where $n = |s|$ and $n_i$ is the number of occurrences of the symbol $a_i$ in s. Obviously, if $\hat{s}$ consists mainly of 0's and other small integers, $H_0(\hat{s})$ will be small and $\hat{s}$ will be efficiently compressed by a zeroth order algorithm.
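As a small illustration of definition (1), the zeroth order entropy can be computed directly from the symbol counts. The helper name `h0` is an assumption; logarithms are taken to base 2, and symbols that do not occur contribute nothing, in line with the convention 0 log 0 = 0.

```python
from collections import Counter
from math import log2


def h0(s):
    """Zeroth order empirical entropy of a sequence (string or list)."""
    n = len(s)
    if n == 0:
        return 0.0
    # Sum over the symbols that actually occur in s.
    return sum(-(count / n) * log2(count / n) for count in Counter(s).values())


# A skewed sequence of small integers has low entropy ...
print(round(h0([0, 0, 0, 1]), 3))  # 0.811
# ... while a uniform one over four symbols needs 2 bits per symbol.
print(h0([0, 1, 2, 3]))  # 2.0
```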

Note that neither Huffman coding nor arithmetic coding is able to achieve compression ratio $H_0(s)$ for every string s. However, arithmetic coding can get quite close to that: in [15] Howard and Vitter proved that the arithmetic coding procedure described in [38] is such that for every string s its output size Arit(s) is bounded by

$\mathrm{Arit}(s) \le |s| H_0(s) + \mu_1 |s| + \mu_2$  (2)

with $\mu_1 \approx 10^{-2}$. In other words, the compression ratio Arit(s)/|s| is bounded by the entropy plus a small constant plus a term which vanishes as $|s| \to \infty$.

In the following we denote with BW0 the algorithm bw + mtf + Arit, that is, the algorithm which given s returns Arit(mtf(bw(s))). BW0 is the basic algorithm described in [7] and it has been tested in [13] (under the name bs-Order0). Although it is one of the simplest BWT-based algorithms, it has better performance than gzip, which is the current standard for lossless compression.

^4 Obviously, to completely determine the encoding we must specify the status of the mtf list at the beginning of the procedure.

^5 In the following all logarithms are taken to the base 2, and we assume 0 log 0 = 0.

Note that the other BWT-based compressors described in the literature do not differ substantially from BW0. Most of the differences are in the last step of the procedure, that is, in the techniques used for exploiting the skewness of $\hat{s}$ = mtf(bw(s)). A technique commonly used by BWT compressors is run-length encoding, which we will discuss in the next section. Move-to-Front encoding is used by all BWT compressors with the only exceptions of [4] and [33], which use slightly different encoding procedures.



3 BWT compression vs entropy 

It is generally known that the entropy of a string constitutes a lower bound to how much we can compress it. However, this statement is not precise since there are several definitions of entropy, each one appropriate for a particular model of the input. For example, in the information theoretic setting, it is often assumed that the input string is generated by a finite memory ergodic source S, sometimes with additional properties. A typical result in this setting is that the average compression ratio^6 achieved by a certain algorithm approaches the entropy of the source as the length of the input goes to infinity. This approach has been very successful in the study of dictionary based compressors such as lz77, lz78 and their variants (which include gzip, pkzip, and compress).

Compression algorithms can also be studied in a worst-case setting, which is more familiar to people in the computer science community. In this setting the compression ratio of an algorithm is compared to the empirical entropy of the input. The empirical entropy, which is a generalization of (1), is defined in terms of the number of occurrences of each symbol or group of symbols in the input. Since it is defined for any string without any probabilistic assumption, the empirical entropy naturally can be used to establish worst case results (that is, results which hold for every possible input string).

The first results on the compression of BWT-based algorithms have been proved in the information theoretic setting. In [28, 29] Sadakane has proposed and analyzed three different algorithms based on the BWT. Assuming the input string is generated by a finite-order Markov source, he proved that the average compression ratio of these algorithms approaches the entropy of the source. More recently, Effros [10] has considered similar algorithms and has given bounds on the speed at which the average compression ratio approaches the entropy. Although these results provide useful insight on the BWT, they are not completely satisfying. The reason is that these results deal with algorithms which are not realistic (and in fact are not used in practice). For example, some of these algorithms require the knowledge of quantities which are usually unknown, such as the order of the Markov source or the number of states in the ergodic source.

^6 The average is computed using the probability of each string of being generated by the source S.

In the following we consider the algorithm BW0 = bw + mtf + Arit described in the previous section and we show that for any string s the output size BW0(s) can be bounded in terms of the empirical entropy of s. To our knowledge, this is the first analysis of a BWT-based algorithm which does not rely on probabilistic assumptions. The details of the analysis can be found in [21].

Let s be a string of length n over the alphabet $A = \{a_1, \ldots, a_h\}$, and let $n_i$ denote the number of occurrences of the symbol $a_i$ inside s. The zeroth order empirical entropy of the string s is defined by (1). The value $|s| H_0(s)$ represents the output size of an ideal compressor which uses $-\log \frac{n_i}{n}$ bits for coding the symbol $a_i$. It is well known that this is the maximum compression we can achieve using a uniquely decodable code in which a fixed codeword is assigned to each alphabet symbol. We can achieve a greater compression if the codeword we use for each symbol depends on the k symbols preceding it. For any length-k word $w \in A^k$ let $w_s$ denote the string consisting of the characters following w inside s. Note that the length of $w_s$ is equal to the number of occurrences of w in s, or to that number minus one if w is a suffix of s. The value

$H_k(s) = \frac{1}{|s|} \sum_{w \in A^k} |w_s| H_0(w_s)$  (3)

is called the k-th order empirical entropy of the string s. The value $|s| H_k(s)$ represents a lower bound to the compression we can achieve using codes which depend on the k most recently seen symbols. Not surprisingly, for any string s and $k \ge 0$, we have $H_{k+1}(s) \le H_k(s)$.

Example 1. Let s = mississippi. From (1) we get $H_0(s) \approx 1.823$. For k = 1 we have $\mathtt{m}_s$ = i, $\mathtt{i}_s$ = ssp, $\mathtt{s}_s$ = sisi, $\mathtt{p}_s$ = pi. By (1) we have $H_0$(i) = 0, $H_0$(ssp) = 0.918, $H_0$(sisi) = 1, $H_0$(pi) = 1. According to (3) the first order empirical entropy is $H_1(s) \approx 0.796$. □
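The numbers in Example 1 can be reproduced with a short sketch of definitions (1) and (3). The helper names `h0` and `hk` are assumptions made here; `hk` groups, for every length-k word w, the characters that follow w inside s (the strings $w_s$ of the text).

```python
from collections import Counter, defaultdict
from math import log2


def h0(s):
    """Zeroth order empirical entropy, definition (1)."""
    n = len(s)
    if n == 0:
        return 0.0
    return sum(-(count / n) * log2(count / n) for count in Counter(s).values())


def hk(s, k):
    """k-th order empirical entropy, definition (3)."""
    if k == 0 or not s:
        return h0(s)
    followers = defaultdict(list)  # maps each word w to the string w_s
    for i in range(len(s) - k):
        followers[s[i:i + k]].append(s[i + k])
    return sum(len(ws) * h0(ws) for ws in followers.values()) / len(s)


print(round(h0("mississippi"), 3))     # 1.823
print(round(hk("mississippi", 1), 3))  # 0.796
```

Words w that never occur in s contribute nothing to the sum, so it is enough to range over the contexts actually seen.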

We now show that the compression achieved by BW0 can be bounded in terms of the empirical k-th order entropy of the input string for any $k \ge 0$. From (3) we see that to achieve the k-th order entropy "it suffices", for any $w \in A^k$, to compress the string $w_s$ up to its zeroth order entropy $H_0(w_s)$. One of the reasons for which this is not an easy task is that the symbols of $w_s$ are scattered within the input string. But this problem is solved by the Burrows-Wheeler transform! In fact, from the discussion of the previous section we know that for any w the symbols of $w_s$ are grouped together inside bw(s). More precisely, bw(s) contains as a substring a permutation of $w_s$. Permuting the symbols of a string does not change its zeroth order entropy. Hence, thanks to the Burrows-Wheeler transform, the problem of achieving $H_k(s)$ is reduced to the problem of compressing several portions of bw(s) up to their zeroth order entropy. Note that even this latter problem is not an easy one. For example, compressing bw(s) up to its zeroth order entropy is not enough, as the following example shows.

Example 2. Let $s_1 = a^n b$, $s_2 = b^n a$. We have $|s_1| H_0(s_1) + |s_2| H_0(s_2) \approx 2 \log n$.
If we compress the concatenation $s_1 s_2$ up to its zeroth order entropy, we get an
output size of roughly $|s_1 s_2| H_0(s_1 s_2) = 2n + 2$ bits. □

The key element for achieving the k-th order entropy is mtf encoding. We have
seen in the previous section that processing bw(s) with mtf produces a string
which is in general highly compressible. In [21] it is shown that this intuitive
notion can be turned into the following quantitative result.

Theorem 1. Let s be any string over the alphabet $\{a_1, \ldots, a_h\}$, and $\hat{s} = \mathrm{mtf}(s)$.
For any partition $s = s_1 \cdots s_t$ we have

$$|\hat{s}|\, H_0(\hat{s}) \le 8 \sum_{i=1}^{t} |s_i|\, H_0(s_i) + \frac{2}{25}|s| + t\,(2h \log h + 9). \qquad (4)$$

□

The above theorem states that the entropy of mtf(s) can be bounded in terms
of the weighted sum

$$\frac{|s_1|}{|s|} H_0(s_1) + \frac{|s_2|}{|s|} H_0(s_2) + \cdots + \frac{|s_t|}{|s|} H_0(s_t) \qquad (5)$$

for any partition $s_1 s_2 \cdots s_t$ of the string s. We do not claim that the bound
in (4) is tight (in fact, we believe it is not). In most cases the entropy of mtf(s)
turns out to be much closer to the sum (5). For example, for the strings in
Example 2 we have $\mathrm{mtf}(s_1 s_2) = 0^n 1 0^n 1$, so that $H_0(\mathrm{mtf}(s_1 s_2))$ is exactly equal
to $\frac{|s_1|}{|s_1 s_2|} H_0(s_1) + \frac{|s_2|}{|s_1 s_2|} H_0(s_2)$.
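
The remark on Example 2 can be verified numerically. The sketch below assumes the standard move-to-front encoding with the list initialised in alphabet order (a, b); this initial order and the function names are assumptions made for illustration, not details taken from the text above.

```python
from collections import Counter
from math import log2

def h0(seq):
    """Zeroth order empirical entropy of a string or list of codes."""
    n = len(seq)
    if n == 0:
        return 0.0
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())

def mtf(s, alphabet):
    """Move-to-front: output the current rank of each symbol,
    then move that symbol to the front of the list."""
    table = list(alphabet)
    out = []
    for ch in s:
        rank = table.index(ch)
        out.append(rank)
        table.insert(0, table.pop(rank))
    return out

n = 1000
s1, s2 = "a" * n + "b", "b" * n + "a"
codes = mtf(s1 + s2, "ab")

# entropy of the pieces versus entropy of the concatenation
pieces = len(s1) * h0(s1) + len(s2) * h0(s2)
whole = len(s1 + s2) * h0(s1 + s2)
print(pieces, whole, len(codes) * h0(codes))
```

With n = 1000 the pieces need only a few dozen bits, the concatenation needs 2n + 2 bits at zeroth order, and the mtf output again needs only as much as the pieces.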

From (4) it follows that if we compress mtf(s) up to its zeroth order entropy
the output size can be bounded in terms of the sum $\sum_i |s_i| H_0(s_i)$. This result
enables us to prove the following bound for the output size of BW0.

Corollary 1. For any string s over $A = \{a_1, \ldots, a_h\}$ and $k \ge 0$ we have

$$\mathrm{BW0}(s) \le 8|s|\, H_k(s) + \left(\mu_1 + \frac{2}{25}\right)|s| + h^k (2h \log h + 9) + \mu_2, \qquad (6)$$

where $\mu_1, \mu_2$ are defined in (2).

Proof. From the above discussion we know that bw(s) can be partitioned into at
most $h^k$ substrings, each one corresponding to a permutation of a string $w_s$ with
$w \in A^k$. Hence, by (3) and Theorem 1 the string $\hat{s} = \mathrm{mtf}(\mathrm{bw}(s))$ is such that

$$|\hat{s}|\, H_0(\hat{s}) \le 8|s|\, H_k(s) + \frac{2}{25}|s| + h^k (2h \log h + 9). \qquad (7)$$

In addition, by (2) we know that

$$\mathrm{BW0}(s) = \mathrm{Arit}(\hat{s}) \le |\hat{s}|\, H_0(\hat{s}) + \mu_1 |\hat{s}| + \mu_2,$$

which, combined with (7), proves the corollary. □

Note that any improvement in the bound (4) automatically yields an
improvement in the bound of Corollary 1. Although the constants in (6) are
admittedly too high for our result to have a practical impact, it is reassuring
to know that an algorithm which works well in practice has nice theoretical
properties. Our result guarantees, to some extent, that BW0 remains competitive for
very long strings, or strings with very small entropy. From this point of view,
the most “disturbing” term in (6) is $(\mu_1 + 2/25)|s|$, which represents a constant
overhead per input symbol. This overhead comes in part (the term $\mu_1$) from the
bound (2) on arithmetic coding, and in part (the term 2/25) from Theorem 1
on mtf encoding. However, in our opinion the inefficiency which results from
Corollary 1 is also due to the fact that the bound provided by the k-th order
entropy is sometimes too conservative and cannot be reasonably achieved. This
is shown by the following example.

Example 3. Let $s = cc(ab)^n$. We have $a_s = b^n$, $b_s = a^{n-1}$, $c_s = ca$. This yields

$$|s|\, H_1(s) = n H_0(b^n) + (n-1) H_0(a^{n-1}) + 2 H_0(ca) = 2.$$

Hence, to compress s (which has length 2n + 2) up to its first order entropy we
should be able to encode it using only 2 bits. □

The reason for which in the above example $|s| H_1(s)$ fails to provide a reasonable
bound is that for any string consisting of multiple copies of the same symbol,
for example $s = a^n$, we have $H_0(s) = 0$. Since the output of any compression
algorithm must contain enough information to recover the length of the input, it
is natural to consider the following alternative definition of zeroth order empirical
entropy. For any string s let

$$H_0^*(s) = \begin{cases} 0 & \text{if } |s| = 0, \\ (1 + \lfloor \log |s| \rfloor)/|s| & \text{if } |s| \ne 0 \text{ and } H_0(s) = 0, \\ H_0(s) & \text{otherwise.} \end{cases} \qquad (8)$$

Note that $1 + \lfloor \log |s| \rfloor$ is the number of bits required to express |s| in binary. In [21]
it is shown that starting from $H_0^*$ one can define a k-th order modified empirical
entropy $H_k^*$ which provides a more realistic lower bound to the compression ratio
we can achieve using contexts of size k or less.
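
Definition (8) translates directly into code. The sketch below is illustrative only (the names `h0` and `h0_star` are not from the paper); it uses the fact that $H_0(s) = 0$ exactly when s contains a single distinct symbol, and that Python's `int.bit_length()` returns $1 + \lfloor \log_2 n \rfloor$ for $n \ge 1$.

```python
from collections import Counter
from math import log2

def h0(s):
    """Zeroth order empirical entropy in bits per symbol."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum(c / n * log2(c / n) for c in Counter(s).values())

def h0_star(s):
    """Modified zeroth order empirical entropy, following (8)."""
    n = len(s)
    if n == 0:
        return 0.0
    if len(set(s)) == 1:
        # H_0(s) = 0: charge the bits needed to write |s| in binary
        return n.bit_length() / n
    return h0(s)

print(h0_star("a" * 8), h0_star("ab"))
```

For $s = a^8$ the length 8 needs 4 bits in binary, so $H_0^*(s) = 4/8$ while $H_0(s) = 0$.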

The entropy $H_k^*$ has been used to analyze a variant of the algorithm BW0.
This variant, called BW0_RL, has an additional step consisting in the run-length
encoding of the runs of zeroes produced by the Move-to-Front transformation*.
As reported in [12] many BWT-based compressors make use of this technique.
In [21] it is shown that for any $k \ge 0$ there exists a constant $g_k$ such that for
any string s

$$\mathrm{BW0}_{RL}(s) \le (7 + \epsilon)\,|s|\, H_k^*(s) + g_k, \qquad (9)$$

where $\mathrm{BW0}_{RL}(s)$ is the output size of BW0_RL, $\epsilon \approx 10^{-2}$, and $H_k^*(s)$ is the modified
k-th order empirical entropy. The significance of (9) is that the use of run-length
encoding makes it possible to get rid of the constant overhead per input symbol,
and to reduce the size of the multiplicative constant associated to the entropy.

* This means that every sequence of m zeros produced by mtf encoding is replaced by
the number m written in binary.
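
The run-length step can be illustrated as follows. This is only a sketch: the text does not say how the binary digits are kept apart from ordinary mtf ranks, so the tagged tuples `("bit", b)` and `("rank", r)` used here are a hypothetical representation chosen for readability.

```python
def rle_zero_runs(codes):
    """Replace every maximal run of m zeros in an mtf output by the
    binary digits of m; non-zero ranks are passed through unchanged.
    The ("bit", ...) / ("rank", ...) tags are an assumed representation."""
    out = []
    i = 0
    while i < len(codes):
        if codes[i] == 0:
            j = i
            while j < len(codes) and codes[j] == 0:
                j += 1
            out.extend(("bit", int(b)) for b in format(j - i, "b"))
            i = j
        else:
            out.append(("rank", codes[i]))
            i += 1
    return out

print(rle_zero_runs([0, 0, 0, 0, 0, 2, 0, 1]))
```

A run of m zeros shrinks from m symbols to about log m symbols, which is how the per-symbol overhead on long runs disappears. Two runs are never adjacent, because maximal runs are always separated by a non-zero rank.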

To our knowledge, a bound similar to (9) has not been proven for any other
compression algorithm. Indeed, for many of the better known algorithms (including
some BWT-based compressors) one can prove that a similar bound cannot
hold. For example, although the output of lz77 and lz78 is bounded by
$|s| H_k(s) + O(|s| \log\log|s| / \log|s|)$, for any $\lambda > 0$ we can find a string s such
that the output of these algorithms is greater than $\lambda |s| H_1(s)$ (see [16]; obviously
for such strings we have $H_k(s) \ll \log\log|s| / \log|s|$). The algorithm PPMC [25],
which has been the state of the art compressor for several years, predicts the
next symbol on the basis of the l previous symbols, where l is a parameter of the
algorithm. Thus, there is no hope that its compression ratio approaches the k-th
order entropy for k > l. Two algorithms for which a bound similar to (9) might
hold for any $k \ge 0$ are DMC [9] and PPM* [8]. Both of them predict the next
symbol on the basis of a (potentially) unbounded context and they work very
well in practice. Unfortunately, these two algorithms have not been analyzed
theoretically, and an analysis does not seem to be around the corner.



4 Algorithmic issues 

In this section we discuss some of the algorithmic issues which arise in the 
realization of an efficient compressor based on the BWT. Our discussion is by no 
means exhaustive; much additional useful information can be found for example
in [3] and [12]. The reader who wants to know everything about an efficient BWT- 
based compressor may look at the source code of the algorithm bzip2 which is 
freely available [34]. 

Most BWT-based compressors process the input file in blocks. A single block
is read, compressed and written to the output file before the next one is considered.
This technique provides a simple means for controlling the memory
requirements of the algorithm and a limited capability for error recovery. As
a general rule, the larger the block size, the slower the algorithm and the
better the compression. In bzip2 the block size can be chosen by the user in
the range from 100 KB to 900 KB.

The most time consuming step of BWT-based algorithms is the computation
of the transformed string bw(s). In Sect. 2 we defined bw(s) to be the string
obtained by lexicographically sorting the prefixes of s. This view has been adopted
in some implementations, whereas in other cases bw(s) is defined considering the
lexicographic sorting of the suffixes of s. This difference does not significantly
affect either the running time or the final compression. From an algorithmic
point of view the problems of sorting suffixes or prefixes are equivalent; in the
following we refer to the suffix sorting problem, which is more often encountered
in the algorithmic literature.

The problem of sorting all suffixes of a string s has been studied even before
the introduction of the BWT because of its relevance in the field of string
matching. The problem can be solved in linear time (that is, proportional to
|s|), by building a suffix tree for s and traversing the leaves from left to right.
There are three “classical” linear time algorithms for the construction of a suffix
tree [23,35,37]. These algorithms require linear space, unfortunately with large
multiplicative constants. The most space-economical algorithm is the one by
McCreight [23] which, for a string of length n, requires 28n bytes* in the worst
case. For BWT algorithms this large storage requirement is a serious drawback
since it limits the block size that can be used in practice (as we mentioned before,
larger blocks usually yield better compression). For this reason suffix tree
algorithms have not been commonly used for computing the BWT. Recently this
state of affairs has begun to change. A currently active area of research is the
development of compact representations of suffix trees [1, 17]. For example, one
of the algorithms in [17] builds a suffix tree using 20 bytes per input symbol in the
worst case and 10 bytes per input symbol on average for “real life” files. The use
of these space-economical suffix tree construction algorithms for the BWT has
been discussed in [4].

A data structure which is commonly used as an alternative to the suffix tree
is the suffix array [20]. The suffix array A of a string s is such that $A_i$ contains
the index of the starting point of the i-th suffix in the lexicographic order. For
example for s = mississippi the suffix array is A = [11, 8, 5, 2, 1, 10, 9, 7, 4, 6, 3].
Obviously, from the suffix array one can immediately derive the string bw(s). The
suffix array for a string of length n can be computed in $O(n \log n)$ time using the
algorithms by Manber and Myers [20] or Larsson and Sadakane [19]. The major
advantage of suffix array algorithms with respect to suffix tree algorithms is
their small memory requirements. Both Manber-Myers’s and Larsson-Sadakane’s
algorithms only need an auxiliary array of size n in addition to the space for
their input and output (the string s and the suffix array A). Their total space
requirement is therefore 9 bytes per input symbol.
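
The suffix array of the example can be reproduced with a few lines of Python. This is a deliberately naive sketch that sorts the suffixes by direct comparison; it is not one of the $O(n \log n)$ algorithms cited above. Positions are 1-based as in the text. The second function reads off the symbol preceding each sorted suffix, which corresponds to the suffix-sorting variant of the transform mentioned earlier in this section; treating the string as cyclic for the first suffix is an assumption of the sketch.

```python
def suffix_array(s):
    """1-based starting positions of the suffixes of s in
    lexicographic order (naive sort, for illustration only)."""
    return [i + 1 for i in sorted(range(len(s)), key=lambda i: s[i:])]

def preceding_symbols(s, sa):
    """Symbol preceding each suffix in sorted order; the suffix that
    starts at position 1 wraps around to the last symbol of s."""
    return "".join(s[a - 2] for a in sa)

sa = suffix_array("mississippi")
print(sa)
print(preceding_symbols("mississippi", sa))
```

With one byte per symbol and four bytes per integer, the input, the array A and one auxiliary integer array take n + 4n + 4n = 9n bytes, which is where the figure of 9 bytes per input symbol comes from.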

In [19] Larsson and Sadakane report the results of a thorough comparison
between several suffix sorting algorithms. They use test files of size up to 125MB,
therefore they only consider algorithms with small memory requirements. In addition
to their suffix array construction algorithm they have tested a space-economical
suffix tree construction algorithm described in [17], the Manber-Myers
suffix array construction algorithm as implemented in [24], and the Bentley-Sedgewick
string sorting algorithm [5]. From the results of their extensive testing
it turns out that the performance of suffix sorting algorithms is significantly
influenced by the average Longest Common Prefix (average LCP from now on) between
adjacent suffixes in the sorted order**. For files with a small average LCP
(up to 20.1), the fastest algorithm is the Bentley-Sedgewick algorithm. Note that
this is a generic string sorting algorithm which does not make explicit use of the
fact that the strings to be sorted are the suffixes of a given string. For files with
a larger average LCP the fastest algorithm is Larsson-Sadakane’s. For the file pic,
consisting of a black and white bitmap with an average LCP of 2,353.4,
the fastest algorithm is the space-economical suffix tree construction algorithm
from [17].

* In this section space requirements are computed assuming that an input symbol
requires 1 byte and an integer requires 4 bytes.

** The average LCP can also be seen as the average number of symbols which must be
inspected to distinguish between two adjacent suffixes in the sorted order.

5 BWT variants

In this section we describe two recently proposed variants of the Burrows-Wheeler
transform which we believe promise interesting developments.

The first variant is due to Schindler [32] and consists of a transform $\mathrm{bws}_k$
which is faster than the BWT and produces a string which is still highly compressible
in the sense discussed in Section 2. Given a parameter k > 0, Schindler’s
idea is to sort the rows of the cyclic shifts matrix of Fig. 1 according to their last
k symbols only. In case of ties (rows ending with the same k-tuple) the relative
order of the unsorted matrix must be maintained. For the example of Fig. 1, if
the sorting is done with k = 1 the first column of the sorted matrix becomes
msspipisisi, so that $\mathrm{bws}_1$(mississippi) = msspipisisi. Schindler proved that
from $\mathrm{bws}_k(s)$ it is possible to retrieve s with a procedure only slightly more
complex than the inverse BWT.
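
The k = 1 example can be reproduced with a stable sort. The sketch below does not build the cyclic shifts matrix; instead it sorts each symbol by the (at most) k symbols that precede it in s, read from right to left, which is the prefix-sorting view of the transform recalled in Sect. 4. Truncating the context at the start of the string rather than wrapping around is an assumption of this sketch, and the function name `bws` is illustrative.

```python
def bws(s, k):
    """Sort the symbols of s by their k preceding symbols (compared
    right to left), keeping text order among equal contexts."""
    def context(i):
        # the up-to-k symbols before position i, nearest symbol first
        return s[max(0, i - k):i][::-1]
    order = sorted(range(len(s)), key=context)  # sorted() is stable
    return "".join(s[i] for i in order)

print(bws("mississippi", 1))
```

Because Python's `sorted` is stable, ties are broken by position in the text, which matches the requirement that the relative order of the unsorted matrix be maintained.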

This new transformation has several attractive features. It is obvious that
computing $\mathrm{bws}_k(s)$ is faster than computing bw(s), especially for small values
of k. For example, suffix array construction algorithms can be used to compute
$\mathrm{bws}_k(s)$ in $O(|s| \log k)$ time. It is also obvious that if w is a length-k substring of
s, the symbols following w in s are consecutive in $\mathrm{bws}_k(s)$. Hence, the properties
of bw(s) hold, up to a certain extent, for the string $\mathrm{bws}_k(s)$ as well. In particular,
$\mathrm{bws}_k(s)$ will likely consist of the concatenation of substrings containing a small
number of distinct symbols and we can expect a good compression if we process
it using mtf + Arit (that is, mtf encoding followed by zeroth order arithmetic
coding). Reasoning as in Section 3 it is not difficult to prove that the output
size of $\mathrm{bws}_k$ + mtf + Arit can be bounded in terms of the k-th order entropy
of the input string. Note that by choosing the parameter k we can control the
compression/speed tradeoff of the algorithm (a larger k will usually increase
both the running time and the compression ratio).

Schindler has implemented this modified transform, together with other minor
improvements, in the szip compressor [33] which is one of the most effective
algorithms for the compression of large files (see the results reported in [2]).

The second important variant of the BWT has been introduced by Sadakane
in [30]. He observed that starting with the output of a BWT-based algorithm we
can build the suffix array of the input string in a very efficient way*. Since the
suffix array allows fast substring searching, one can develop efficient algorithms
for string matching in a text compressed by a BWT-based algorithm. Sadakane
went further, observing that the suffix array of s cannot be used to solve efficiently
the important problem of case-insensitive search within s. Therefore he
suggested a modified transform bwu in which the sorting of the rows of the cyclic
shifts matrix is done ignoring the case of the alphabetic symbols. He called this
technique unification and showed how to extend it to multi-byte character codes
such as the Japanese EUC code. Sadakane has proven that even this modified
transform is reversible, that is, from bwu(s) we can retrieve s (with the correct
case for the alphabetic characters!). Preliminary tests show that the use of this
modified transform affects the running time and the overall compression only
slightly. These minor drawbacks are more than compensated by the ability to
efficiently perform case-insensitive searches in the compressed string.

* More precisely, at an intermediate step in the decompression procedure we get the
string bw(s) from which we can easily derive the suffix array for s. It turns out that
building the suffix array starting from the compressed string is roughly three times
faster than building it starting from s.

We believe this is a very interesting development which may become a definite
plus of BWT-based compressors. The problem of searching inside large
compressed files is becoming more and more important and has also been studied
for dictionary-based compressors (see for example [11]).
However, the algorithms proposed so far are mainly of theoretical interest and,
to our knowledge, they are not used in practice.



6 Conclusions 

Five years have now passed since the introduction of the BWT. In these five
years our understanding of several theoretical and practical issues related to
the BWT has significantly increased. We can now say that, far from being a
one-shot result, the BWT has many interesting facets and that it is going to
deeply influence the field of lossless data compression. The variants described in
Section 5 are especially intriguing. It would be worthwhile to investigate whether
similar variants can be developed for the lossless compression of images or other
data with a non-linear structure.

The biggest drawback of BWT-based algorithms is that they are not on-line,
that is, they must process a large portion of the input before a single output bit
can be produced. The issue of developing on-line counterparts of BWT-based
compressors has been addressed for example in [14,22,28,39], but further work
is still needed in this direction.

Abstract. We consider efficiency of NC-algorithms for pattern-searching
in highly compressed one- and two-dimensional texts. “Highly compressed”
means that the text can be exponentially large with respect
to its compressed version, and “fast” means “in polylogarithmic time”.
Given an uncompressed pattern P and a compressed version of a text
T, the compressed matching problem is to test if P occurs in T. Two
types of closely related compressed representations of 1-dimensional texts
are considered: the Lempel-Ziv encodings (LZ, in short) and restricted
LZ encodings (RLZ, in short). For highly compressed texts there is a
small difference between them; in extreme situations both of them compress
text exponentially, e.g. Fibonacci words of size N have compressed
versions of size O(log N) for LZ and Restricted LZ encodings. Despite
similarities we prove that LZ-compressed matching is P-complete while
RLZ-compressed matching is rather trivially in NC. We show how to improve
a naive straightforward NC algorithm and obtain almost optimal
parallel RLZ-compressed matching applying tree-contraction techniques
to directed acyclic graphs with polynomial tree-size. As a corollary we
obtain an almost optimal parallel algorithm for LZW-compressed matching
which is simpler than the (more general) algorithm in [11]. Highly
compressed 2-dimensional texts are also considered.



1 Introduction 

In this paper we consider algorithms dealing with manipulations of highly compressed
one- and two-dimensional texts (two-dimensional arrays filled with symbols
from a finite alphabet). The objects considered are potentially exponentially
compressed. Denote by $\tilde{T}$ a compressed version of a text T. The compressed pattern
matching problem for one- and two-dimensional texts is

Input: an explicitly given pattern P and a compressed version $\tilde{T}$ of T,
Output: “yes” if P occurs in T, otherwise “no”.

The main size of the problem is the size $n = |\tilde{T}|$ of the description of the
text T. Denote |P| = m. Our model of computation is a CRCW PRAM, see
[13]. The family of Lempel-Ziv encodings is one of the most successful compression
techniques for 1-dimensional texts. We consider what is known as LZ1 or
LZ77 encoding and its variations. There are many different variations of LZ-encodings;
we use the one based on factorizations of texts, similarly as in [9].
In LZ-encoding a given text w is represented as a composition of its subword-components
(called here factors): $w = f_1 f_2 f_3 \cdots f_k$. In LZ-encodings each
$f_i$ is a subword $w[p \ldots q]$ of $f_1 f_2 \cdots f_{i-1}$ or $f_i \in \Sigma$, and in the former case we
identify $f_i$ with the pair [p, q]. In the Restricted LZ-encoding each $f_i$ is assumed
to be a composition $f_p f_{p+1} \cdots f_q$ of consecutive earlier factors, for q < i, or
$f_i \in \Sigma$; in the former case we identify $f_i$ with [p, q]. $\Sigma$ is an input alphabet.

Observation 1. Observe that in LZ-encoding the integers p, q correspond to positions
in w, while in RLZ-encoding they correspond to indices of factors, and are
smaller. For highly compressed texts the numbers p, q can have n bits in LZ-encoding
but only log n bits in RLZ-encoding. On the other hand an RLZ-encoding
can have more factors.

Example. Define Fibonacci words by: $F_1 = b$; $F_2 = a$; $F_{k+2} = F_{k+1} \cdot F_k$, for
k > 0. Then the following factorization of $F_8$ is both an LZ and an RLZ factorization:

$Fib_8$ = abaababaabaababaababa = $f_1\ f_2\ f_3\ f_4\ f_5\ f_6\ f_7$
= a b a aba baaba ababaaba ba

Hence an LZ-encoding of $Fib_8$ is a b a [1,3] [2,6] [4,11] [2,3],
while a possible RLZ-encoding is a b a [1,3] [2,4] [4,5] [2,3].
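
Both encodings of the example can be decoded and compared with the Fibonacci word itself. The following sketch represents a factor either as a single letter or as a pair (p, q) with 1-based inclusive bounds, as in the example; the function names are illustrative.

```python
def decode_lz(factors):
    """LZ: a pair (p, q) copies positions p..q of the text decoded so far."""
    w = ""
    for f in factors:
        if isinstance(f, str):
            w += f
        else:
            p, q = f
            w += w[p - 1:q]
    return w

def decode_rlz(factors):
    """RLZ: a pair (p, q) is the concatenation f_p ... f_q of earlier factors."""
    decoded = []
    for f in factors:
        if isinstance(f, str):
            decoded.append(f)
        else:
            p, q = f
            decoded.append("".join(decoded[p - 1:q]))
    return "".join(decoded)

def fib_word(k):
    """F_1 = b, F_2 = a, F_{k+2} = F_{k+1} F_k."""
    prev, cur = "b", "a"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur

lz = ["a", "b", "a", (1, 3), (2, 6), (4, 11), (2, 3)]
rlz = ["a", "b", "a", (1, 3), (2, 4), (4, 5), (2, 3)]
print(decode_lz(lz) == decode_rlz(rlz) == fib_word(8))
```

The same seven factors are described by position ranges in the LZ case and by ranges of factor indices in the RLZ case, which is the distinction drawn in Observation 1.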

It can happen that an LZ factorization is not an RLZ one; for example the LZ factorization
ababbaabb = a b ab ba abb is not a Restricted LZ-factorization. We omit
the proof of the following fact (point (1) has been essentially shown in [9] using
the idea of so-called fingers).

Lemma 1.

(1) If n is the number of factors in an LZ-encoding of w then there is an RLZ-encoding
of w with O(n^) factors. (2) Assume $|\Sigma| = O(1)$ and w has an LZ-encoding
of total size s, counting all bits, such that s is polylogarithmic with
respect to |w|. Then there is an RLZ-encoding of w with total number of bits
$O(s \cdot \log(s))$.

The basic idea behind lossless 1-dimensional compression is to make use of repetitions
(e.g. in LZ-encoding). In abstract terms it can be represented as a sequence
of recurrences: the next portion of a text uses references to previously specified
portions. The abstract formulation of such sequences of recurrences can be
done using the terminology of straight-line programs (SLP’s, in short), which are,
informally speaking, sequences of equations involving variables and constants.
Sequential algorithms for SLP-compressed matching were considered in [18]. We
shall prove that RLZ-encodings can be easily transformed into SLP-encodings
with a logarithmic increase in size. The advantage of SLP-encodings is that it is
much easier to design algorithms for this type of description; they play a similar
role as Chomsky normal forms for context-free grammars. SLP’s are an especially
convenient tool to construct an NC algorithm for counting the number of occurrences
of a pattern in an RLZ-compressed text.

Another variation of the Lempel-Ziv approach is the Lempel-Ziv-Welch compression
(LZW, in short), which is one of the most widely used text compression methods and is
frequently used under Unix. We refer to [21] for a detailed definition. Basically it is
a Restricted LZ compression in which we can extend factors by single letters only.
An efficient sequential algorithm for LZW-compressed matching was presented
in [1].

Observation 2. In the factorizations we do not assume that the next factor is
the longest possible; there are possibly many different RLZ-encodings of the same
text. LZW-encodings can be treated as RLZ-encodings such that each factor is
a single letter or an extension of an earlier factor by a single letter.

The last compression method discussed in the paper is high compression of 
2-dimensional texts in terms of simple 2-dimensional grammars, or straight- 
line programs (sequences of recurrences), see [3]. The 2-dimensional SLP’s are 
equivalent to deterministic acyclic finite automata describing images. We change 
terminology of states to variables, or nonterminals. Using such terminology it 
is easier to design our algorithms and it unifies the treatment of one- and two- 
dimensional case. Finite automata are a well established tool for the compression 
of 2-dimensional texts (images), see [16,7]. They can describe quite complicated 
images, for example deterministic automata can describe the Hilbert’s curve with 
a given resolution, while finite weighted automata can describe even much more 
complicated curves, see also [5,6]. 

Our main results are: 

P-completeness of LZ-compressed matching.

Almost optimal parallel algorithm for RLZ-compressed matching.

Almost optimal parallel algorithm for LZW-compressed matching which
is simpler than the algorithm for fully compressed matching in [11].

Efficient NC-algorithm counting the number of occurrences of the pattern.

Efficient parallel algorithm for 2-dimensional matching when compression
is in terms of recurrences or finite automata (in the sense of [15]).

2 P-completeness of LZ-compressed matching 

The difference between sequential and NC-computations can be well demonstrated
by the following problem: compute the symbol T[i] where T is the text
given in its LZ-encoding or RLZ-encoding, and i is given in binary. This task
has a trivial sequential linear time algorithm but if we ask for an NC-algorithm
for the same problem the situation is different, and it becomes quite difficult
(P-complete for LZ-encoding and possibly P-complete for RLZ-encodings). Any
straightforward attempt at using the doubling technique and squaring the matrix
of positions fails, since the number of positions in the text whose compressed
version has size n can be $\Omega(2^n)$.

We use the following auxiliary problem called the iterated mod problem 




Efficiency of Fast Parallel Pattern Searching in Highly Compressed Texts



Input: given positive integers $x, m_1, m_2, \ldots, m_k$

Output: “yes” if $((\ldots(x \bmod m_k) \bmod m_{k-1}) \cdots \bmod m_1) = 0$.

Lemma 2. [17] 

The iterated mod problem is P-complete. 
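
Sequentially the iterated mod problem is a simple loop, as the sketch below shows (the function name is illustrative). Each reduction depends on the result of the previous one, and it is this chain of dependencies that makes the problem hard to parallelise.

```python
def iterated_mod(x, moduli):
    """moduli = [m_1, ..., m_k]; x is reduced by m_k first and by m_1
    last. Returns True when the final remainder is 0."""
    for m in reversed(moduli):
        x %= m
    return x == 0

print(iterated_mod(21, [3, 5, 11]))
```

For the instance (21, 3, 5, 11) the successive remainders are 21 mod 11 = 10, 10 mod 5 = 0 and 0 mod 3 = 0, so the answer is “yes”.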



Fig. 1. The structure of $w(x, m_1, m_2, \ldots, m_k)$ for $(x, m_1, m_2, \ldots, m_k) = (21, 3, 5, 11)$.



Theorem 1. 

(1) The problem of testing for any occurrence of an uncompressed pattern in a 
LZ-compressed text is P-complete. 

(2) The problem of computing a symbol on a given position in a LZ-compressed 
text is P-complete. 

Proof. Consider an instance $x, m_1, m_2, \ldots, m_k$ of the iterated mod problem.
We can assume w.l.o.g. that $m_1 < m_2 < \ldots < m_k < x$.

Construct a string $w = w(x, m_1, m_2, \ldots, m_k)$ whose positions are counted
starting from 0. Let $\$_1, \$_2, \ldots, \$_k, \#$ be special different symbols. The structure
of w is

$$w(x, m_1, m_2, \ldots, m_k) = w_1 \$_1 w_2 \$_2 w_3 \$_3 \cdots w_k \$_k v \#,$$

where $|w_i| = m_i$ for $1 \le i \le k$, $|v| = x$, $w_i$ is a period of $w_{i+1}$ for $i < k$,
$w_k$ is a period of v, and $w_1 = a b^{m_1 - 1}$.

Claim 1. The following conditions are equivalent:

1. $((\ldots(x \bmod m_k) \bmod m_{k-1}) \cdots \bmod m_1) = 0$

2. w contains the symbol a at the penultimate position of w

3. the pattern a# occurs in w

It is now sufficient to show the following fact.

Claim 2. The LZ-encoding of $w(x, m_1, m_2, \ldots, m_k)$ has a number of bits polynomial
with respect to the total number of bits in an instance $(x, m_1, m_2, \ldots, m_k)$, and
it can be computed by an NC-algorithm.

Each $w_{i+1}$ is of the form $w_i^t w_i'$, where t is a nonnegative integer and $w_i'$ is a prefix
(possibly full) of $w_i$. Let $2^j$ be the largest power of two not exceeding t. Assume
the position of the first symbol of $w_{i+1}$ is r + 1; then the encoding of the segment
is $[r - |w_i|, r - 1]$, $[r + 1, r + m_i]$, $[r + 1, r + 2m_i]$, ..., $[r + 1, r + 2^{j-1} m_i]$,
$[r + 1, r + m_{i+1} - 2^j m_i]$.

For example the encoding of the word $w = w(69, 2, 3) = ab\,\$_1\,aba\,\$_2\,(aba)^{23}ab\,\#$ is
a b $\$_1$ [0,1] a $\$_2$ [3,5] [7,9] [7,12] [7,18] [7,30] [7,29] #.

Such encodings can be easily constructed in parallel. This completes the proof. 
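
The doubling scheme behind Claim 2 can be sketched as follows. The function `periodic_segment_factors` and its parameters are illustrative names: r is the position of the separator before the segment, m the length of the period stored at positions r−m..r−1, and `length` the length of the segment. Positions are 0-based and factor bounds are inclusive, as in the example above, whose last six factors cover a segment of 71 symbols. The placeholders X and Y stand for the two separator symbols.

```python
def periodic_segment_factors(r, m, length):
    """LZ factors for a periodic segment starting at position r + 1.
    Assumes length >= m, so that at least one full period fits."""
    t = length // m
    j = t.bit_length() - 1           # 2**j is the largest power of two <= t
    factors = [(r - m, r - 1)]       # first copy of the period
    for e in range(j):               # each factor doubles what was written
        factors.append((r + 1, r + (2 ** e) * m))
    rest = length - (2 ** j) * m
    if rest > 0:
        factors.append((r + 1, r + rest))
    return factors

def decode_lz0(factors):
    """Decode an LZ-encoding with 0-based inclusive position pairs."""
    w = ""
    for f in factors:
        w += f if isinstance(f, str) else w[f[0]:f[1] + 1]
    return w

tail = periodic_segment_factors(6, 3, 71)
print(tail)
print(decode_lz0(["a", "b", "X", (0, 1), "a", "Y"] + tail + ["#"]))
```

The number of factors per segment grows with the logarithm of the segment length, which is why the whole encoding stays polynomial in the number of bits of the instance even though w itself can be exponentially long.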

We show in the next section that the problem of RLZ-compressed matching is
much easier and it is in NC; however, we want to emphasize that in general
NC-computations related to RLZ-encodings are difficult. We conjecture that the
following problems related to Restricted LZ-encodings are P-complete (the first
two are P-complete for LZ-encodings).

1. Computing the i-th symbol of T when T is given by its RLZ-encoding.

2. The fully compressed pattern-matching: both text T and the pattern P are
given in RLZ-compressed form.

3. Given two RLZ-encodings, do they describe the same text?

Several P-complete problems related to text compression were presented in [8].



3 Naive NC-algorithm for RLZ-compressed matching 

Despite the fact that computing a symbol at a given arbitrarily chosen position
of an RLZ-encoded text is quite possibly P-complete, we use NC-computation of
symbols at some easy positions (close to the critical positions defined below). The
LZ-encoding partitions the text into factors; we call the starting
positions of these factors critical positions. We call a position which
is at distance at most m from a critical position a working position. Due to
Lemma 3 it is enough to compute in NC all the symbols at working positions.
This problem is P-complete for LZ-encoding but it is in NC for RLZ-encoding.
We can use Observation 2.1 from [9], which basically says the following.



Lemma 3. Assume that T is compressed using LZ or RLZ-encoding. Then the 
pattern P occurs in T iff it occurs within a distance m from a critical position. 

Assume w is a text compressed by an RLZ-encoding. For a position i in w denote
by LINK(i) the position which is computed as follows: if the factor containing the
position i is a single symbol then LINK(i) = i, otherwise the factor containing
i is a composition of a sequence of earlier factors. The position i corresponds to
some position j in one of these factors, and we set LINK(i) = j. We need to know
the lengths of factors to decide how they are split and in which component factor
LINK(i) is located.

Observe that if i is a working position then, in the case of RLZ-encodings,
LINK(i) is also a working position. This is the main property of RLZ from the point
of view of NC algorithmics. Such a property does not hold for LZ-encodings.
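
The LINK function can be made concrete with a small sequential sketch. It is not the parallel procedure discussed in this section: it only shows how factor lengths determine LINK and how following LINK repeatedly leads to a single-symbol factor. Positions are 0-based here, factors are letters or 1-based pairs (p, q) of factor indices, and the reading of LINK(i) as the matching position inside the earlier copy $f_p \cdots f_q$ is an interpretation of the definition above.

```python
from bisect import bisect_right
from itertools import accumulate

def factor_lengths(factors):
    """Length of every factor, computed from left to right."""
    lengths = []
    for f in factors:
        if isinstance(f, str):
            lengths.append(1)
        else:
            p, q = f
            lengths.append(sum(lengths[p - 1:q]))
    return lengths

def make_link(factors):
    """Return the LINK function and the start position of each factor."""
    lengths = factor_lengths(factors)
    starts = [0] + list(accumulate(lengths))[:-1]
    def link(i):
        t = bisect_right(starts, i) - 1          # factor containing i
        f = factors[t]
        if isinstance(f, str):
            return i
        p, _ = f
        # f_p ... f_q is contiguous in the text and begins at starts[p - 1]
        return starts[p - 1] + (i - starts[t])
    return link, starts

def symbol_at(factors, i):
    """Follow LINK until a single-symbol factor is reached."""
    link, starts = make_link(factors)
    while link(i) != i:
        i = link(i)
    return factors[bisect_right(starts, i) - 1]

rlz = ["a", "b", "a", (1, 3), (2, 4), (4, 5), (2, 3)]
print("".join(symbol_at(rlz, i) for i in range(21)))
```

Since a composite factor refers only to earlier factors, LINK(i) < i whenever LINK(i) differs from i, so the loop terminates.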

Theorem 2. The naive algorithm solves the RLZ-compressed matching problem in
a polylogarithmic time with 0{n^ • m^) processors. 

Proof. There are the following tasks in the algorithm which need many processors:

1. Computation of lengths of factors: we can use parallel computation of straight-line
programs containing only the + operation. This needs n^ processors.

2. Computation of LINK* needs a transitive closure of a directed graph with nm
vertices, since we have O(nm) working positions. A straightforward algorithm
needs a cubic number of processors.

3. Searching on each overlap individually needs O(nm) processors, m processors
for each of n critical positions.



Algorithm Naive RLZ-matching;

Comment: given pattern P and an RLZ-encoding of T

(1) compute in parallel lengths of all factors;
    Comment: it needs O(n^) processors
(2) for each working position i do in parallel:
    compute LINK(i)
(3) compute the closure LINK* of the graph defined by the LINK table;
(4) for each working position i compute the i-th symbol using LINK*(i)
(5) for each critical position i do in parallel:
    check if there is an occurrence of P overlapping i;
    Comment: such an occurrence contains only working positions;
(6) return "Yes" if there is at least one occurrence;



The number of processors can be easily reduced to n^ + nm but further reduction 
is more complicated. 

Observation 3. It is not clear how to extend the naive algorithm to compute
(by an NC algorithm) the number of all occurrences. The main difficulty is that
now we also have to consider occurrences of the pattern in T which do not
contain any working position. We shall see that the straight-line program approach
is a convenient tool to solve it.

4 Almost optimal parallel RLZ-compressed matching 

In this section we show that RLZ-compressed matching can be solved by an NC
algorithm whose work is linear within a polylogarithmic factor.

Theorem 3. The RLZ-compressed matching problem can be solved in
O(log^(m) + log^(n)) time with O(n) processors.



LZW-encodings can be treated as special RLZ-encodings.

Corollary 1. There is an almost optimal NC-algorithm for LZW-compressed
matching.

Before going into the proof of Theorem 3 we reduce RLZ-matching to searching
for patterns in texts with short recursive descriptions given by straight-line
programs. A short description of a one-dimensional text has the form of a
sequence of recurrence equations or, more formally, a straight-line program,
see [18]. A straight-line program ℛ is a sequence of assignment statements:

    X_1 = expr_1; X_2 = expr_2; ...; X_n = expr_n,

where the X_i are variables and expr_i is a symbol in Σ, or expr_i = X_j · X_k
for some j, k < i. We assume here that all variables are relevant, which means
that we cannot remove any equation from the straight-line program without
affecting its value. In terms of grammars, it means that there are no useless
variables. Such short descriptions are equivalent to acyclic context-free
grammars in Chomsky normal form generating a single text. Testing for useless
variables is P-complete for general context-free grammars, but it can be shown
that it is in NC if each variable appears only once on the left-hand side of a
production, as it does in our case. Denote by R the string which is the value of
the last variable X_n after the execution of the program ℛ. We identify
variables with their values in the sequel.
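
To make the definition concrete, here is a minimal Python sketch of a
straight-line program and its evaluation. The representation is an assumption
made for illustration only: rule i is either a one-character string (a symbol of
Σ) or a pair (j, k) with j, k < i standing for X_i = X_j · X_k, and variables
are numbered from 0. The function name expand is hypothetical. Note that the
generated text can be exponentially longer than the program, so explicit
expansion is only sensible for small examples.

```python
def expand(rules):
    """Return the string generated by every variable of a straight-line program.

    rules[i] is a single symbol (one-character string) or a pair (j, k)
    with j, k < i, meaning X_i = X_j . X_k.
    """
    values = []
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            values.append(rule)
        else:
            j, k = rule
            if not (j < i and k < i):
                raise ValueError("rule %d refers to a later variable" % i)
            values.append(values[j] + values[k])
    return values


# Six equations describing a Fibonacci word of length 8.
fib = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3)]
text = expand(fib)[-1]   # value of the last variable
```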

Lemma 4. RLZ-encodings can be transformed into SLP-encodings with a logarithmic
increase in size in O(log n) time with O(n) processors.

We assume from now on (to the end of this section) that we deal with
SLP-encodings. We say that a pattern P occurs on an overlap iff P occurs in a
variable A in such a way that it starts in B and ends in C, where A = BC is a
rule of the straight-line program. One-dimensional texts with short descriptions
have the following useful property.

Lemma 5. [overlap-lemma] For |P| > 2 each match of P in T occurs on an
overlap.

We say that a variable A is small iff the length of the text generated by it
does not exceed m.

The tree-size of a node of a directed acyclic graph G is the number of paths
from this node to a leaf. If a graph corresponds to a circuit computing an
expression then the tree-contraction approach can be used in a straightforward
way to compute nodes whose tree-size is polynomial.

Lemma 6. The exact lengths of all small variables in the straight-line program
ℛ can be computed in O(log(m) + log^(n)) time with O(n) processors. The
variables whose lengths are not computed are not small.

Proof. We can replace concatenation of variables (whose values are strings) by
addition of numbers (lengths of variables). We can view ℛ as a directed acyclic
graph G, in which variables correspond to nodes, the only operation in the nodes
is addition, and each variable has edges leading to the variables which are on
the right-hand side of the corresponding equation.

Then we can use the tree contraction algorithm called the parallel pebble game,
see [13]. This tree contraction works as well for a binary directed acyclic
graph G (where the notion of the sons is well defined), and after O(log(m))
parallel time the values of all variables whose tree-sizes do not exceed m are
computed. In our case it is easy to see the following fact.

Claim 1. The tree-size of a node v corresponding to a variable x equals the
length of the string generated from x.

Hence after a logarithmic number of applications of our tree contraction applied
to directed acyclic graphs the lengths of all small variables are computed. The
lengths of the variables that are not computed are larger than m. This completes
the proof.
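
The statement of Lemma 6 can be illustrated sequentially. The Python sketch
below is not the parallel pebble game; it is a plain bottom-up pass, under the
assumed representation in which rule i is a one-character string or a pair
(j, k) with j, k < i. It replaces concatenation by addition and records None for
every variable whose length exceeds m, so that exactly the small variables
receive an exact length. The name small_lengths is hypothetical.

```python
def small_lengths(rules, m):
    """Exact lengths of the small variables (length <= m); None otherwise."""
    lengths = []
    for rule in rules:
        if isinstance(rule, str):
            total = 1
        else:
            j, k = rule
            left, right = lengths[j], lengths[k]
            # A variable built from a non-small variable cannot be small.
            total = None if left is None or right is None else left + right
        lengths.append(total if total is not None and total <= m else None)
    return lengths


fib = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3)]
lengths_for_m3 = small_lengths(fib, 3)
```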

Proof (of Theorem 3). We first apply Lemma 6 to compute the exact lengths of
all small variables; if the length of a variable is not computed, then it
exceeds m. For each variable X denote by LEFT(X) (RIGHT(X)) the longest
suffix (prefix) of P which is a prefix (suffix) of X.

Claim 1. We can preprocess the pattern P in O(log m) time with a linear number
of processors in such a way that we can answer in logarithmic time each query
of the form:

    given i, j, p, q, find r, s such that P[i..j] · P[p..q] = P[r..s], if such r, s exist.

Proof (of the claim). We build the suffix tree of P in parallel using the
algorithm from [2]. Each subword of P whose length is a power of two gets a name
(identical subwords have the same names). Using the shortcuts (whose lengths
are powers of two) in the suffix tree we search for P[i..j] · P[p..q] in the
suffix tree in logarithmic time [10]. We omit the details.

Claim 2. We can compute LEFT(A) and RIGHT(A) for each variable A in
O(log^(m) · log(n)) time with O(n + m) processors.

Proof (of the claim). We only show how to compute RIGHT(A) for each A. Denote
by RIGHT'(A) the subword of P which equals the value of A or, if there is no
such subword, let RIGHT'(A) = RIGHT(A). Denote also by ⊗ the operation of
composing two subwords u, w of P. If uw is a subword of P then the result of
u ⊗ w is this subword (given by its first and last positions), otherwise it is
the longest suffix of uw which is a prefix of P. Consider the dependency graph
G of a given straight-line program describing T. G is a directed acyclic graph;
each node (variable) has a left and a right son, unless it is a leaf. For each
variable X let Y be the first variable, found when traversing from X along right
branches, whose right son is a small variable. Let the sons of Y be U, W; then
we create an equation X = U ⊗ W. The computation of Y for each X is again
reducible to some tree computations. In this way we get a new straight-line
program ℛ' which computes RIGHT'(X) for each variable X. The graph of ℛ' has
the property that the tree-size of each right son is O(m), since it corresponds
to a small variable; consequently we obtain the following basic property of ℛ':
the tree-size of each variable with respect to ℛ' is O(n · m).

Now we can compute RIGHT'(X) for each variable X by applying tree contraction
to the graph of ℛ'. This graph is not a tree, but its tree-size is polynomial,
so the number of iterations in the tree-contraction method applied to this
graph is logarithmic. The operations ⊗ can be performed in O(log m) time, due to
Claim 1, if they involve two subwords. If we have a prefix concatenated with a
subword, we can use border trees from [14]. The value of RIGHT(X) can be
recovered from RIGHT'(X). This completes the proof of the claim.

For each variable A with the equation A = BC we can check whether an occurrence
of P overlaps the splitting point of A by testing whether P occurs in
RIGHT(B) · LEFT(C). This can be done fast for all A's after preprocessing the
pattern (using the border trees of [14]).

Theorem 4. The number of occurrences and the first occurrence of the pattern
in an RLZ-encoded text can be found in polylogarithmic time with O(n^3 + m)
processors.

Proof. First we show how to compute the number of occurrences. At the beginning
we compute the number #overlap-occ-in(X) of occurrences overlapping the
splitting point of each variable X. Then the problem is reduced to the
computation of the number of occurrences #X of each variable X in the
generation tree of the text, due to the following fact.

Claim 1. The number of occurrences equals

    Σ_{X is a variable} (#X · #overlap-occ-in(X)).

Assume all variables are relevant. For each variable X let fathers(X) be the
set of possible father variables of X in the generation tree. The numbers #X
can be computed using NC-computation of straight-line programs involving the
operation +, using for each non-root variable the equation

    #X = Σ_{Y ∈ fathers(X)} #Y.

In the generation tree for the terminal text we can search for the first
variable from the left and from the bottom such that the pattern occurs on an
overlap in this variable. We call such a variable a good variable. We have
already shown that testing for the existence of an occurrence of a pattern on an
overlap in each variable can be done within the required complexities. Hence the
problem is reduced to finding the good variable. First, for each pair of
variables we compute in parallel whether the first occurs in the tree generated
by the second. This can be done using the transitive closure of the graph G
from the proof of Lemma 6. Then we compute the sizes of all variables (not only
small ones). The computation can be done using a parallel transitive closure
algorithm working in log^2 n time with O(n^3) processors. Once we know the size
of each variable and we know which of them contain the pattern on an overlap,
we can travel in the generation tree (or rather in a directed acyclic graph
describing it succinctly) from the root down to the first good variable.
Technically it can be done by building another graph G'. For each equation
A = BC we create the edge A → B of weight 0 and the edge A → C of weight |B|.
The computation of the first position of an occurrence is reduced to computing
a shortest path from the root variable to a variable containing the pattern on
an overlap. This completes the proof.
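
The counting argument can be checked on small inputs with the following
sequential Python sketch. It is not the NC algorithm of the proof: instead of
LEFT, RIGHT and border trees it keeps, for every variable, its explicit prefix
and suffix of length at most m − 1 (a simpler technique swapped in here), which
is enough to see every occurrence crossing a splitting point. The representation
(rule i is a one-character string or a pair (j, k) with j, k < i) and the name
count_occurrences are assumptions made for illustration. The array copies plays
the role of #X and counts a variable twice when it is used twice in the same
rule.

```python
def count_occurrences(rules, pattern):
    """Count occurrences of pattern in the text generated by the last rule."""
    m = len(pattern)
    if m < 2:
        raise ValueError("this sketch relies on the overlap lemma, so len(pattern) >= 2")
    keep = m - 1
    prefix, suffix, on_overlap = [], [], []
    for rule in rules:
        if isinstance(rule, str):
            prefix.append(rule)
            suffix.append(rule)
            on_overlap.append(0)          # a single symbol has no splitting point
        else:
            j, k = rule
            # Every occurrence crossing the split lies inside this window,
            # and every occurrence inside the window crosses the split.
            window = suffix[j] + prefix[k]
            hits = sum(window.startswith(pattern, s)
                       for s in range(len(window) - m + 1))
            on_overlap.append(hits)
            prefix.append((prefix[j] + prefix[k])[:keep])
            suffix.append((suffix[j] + suffix[k])[-keep:])
    # copies[i] = number of occurrences of variable i in the generation tree.
    copies = [0] * len(rules)
    copies[-1] = 1
    for i in range(len(rules) - 1, -1, -1):
        if not isinstance(rules[i], str):
            j, k = rules[i]
            copies[j] += copies[i]
            copies[k] += copies[i]
    return sum(c * h for c, h in zip(copies, on_overlap))


fib = ["b", "a", (1, 0), (2, 1), (3, 2), (4, 3)]   # generates "abaababa"
```

For the program fib above, the pattern "ab" occurs on an overlap only in the
variable generating "ab", and that variable occurs three times in the
generation tree.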





5 Searching for two-dimensional patterns 

We now extend the pattern-matching problem for shortly described texts to the
two-dimensional case, using descriptions in terms of restricted grammars,
straight-line programs and (equivalently) deterministic acyclic automata, see
[15]. Many interesting fractals drawn with a given finite resolution can be
described using such a formalism, e.g. Hilbert’s curve, see [3]. The
unrestricted recurrences describing 2-dimensional texts in [3] are much
stronger, and for them the pattern-matching problem for highly compressible
2d-texts is NP-complete. Our description is more restrictive, since we always
compose objects of the same shape. This implies the existence of polynomial-time
algorithms, see [15]. We use the operation of constructing a square table from
4 smaller tables of the same shape. The operation Compose places the squares
generated by X_j, X_k, X_s, X_r in a fixed order. In this way the square of
shape 2^t × 2^t is composed of 4 already defined squares of shape
2^{t-1} × 2^{t-1}. The length of a side of each square corresponding to a
variable is of the form 2^t for some t; the number t is called the rank of this
variable and also the rank of the 2-dimensional text generated by it. A
2-dimensional simple straight-line program ℛ is a sequence of assignment
statements, where each variable has its rank,

    X_1 = expr_1; X_2 = expr_2; ...; X_n = expr_n,

where the X_i are variables and expr_i is either a symbol in Σ, in which case
rank(X_i) = 0, or expr_i = Compose(X_j, X_k, X_s, X_r) for some j, k, s, r < i,
where X_j, X_k, X_s, X_r are variables of rank t − 1 and t = rank(X_i). We
assume here also that all variables are relevant, which means that we cannot
remove any equation from the straight-line program without affecting the final
value. An equivalent short description of a 2d-text is possible in terms of
deterministic acyclic automata, see [15].
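
A small Python sketch may help to visualise Compose. The text only says that
the four squares are placed in a fixed order; the order used below (top-left,
top-right, bottom-left, bottom-right) is an assumption, as are the
representation (rule i is a one-character string or a 4-tuple of earlier
indices) and the name expand_2d. Squares are stored as lists of row strings.

```python
def expand_2d(rules):
    """Return the square (list of row strings) generated by every variable."""
    squares = []
    for i, rule in enumerate(rules):
        if isinstance(rule, str):
            squares.append([rule])                    # rank 0: a 1 x 1 square
        else:
            if any(q >= i for q in rule):
                raise ValueError("rule %d refers to a later variable" % i)
            tl, tr, bl, br = (squares[q] for q in rule)
            if not len(tl) == len(tr) == len(bl) == len(br):
                raise ValueError("rule %d composes squares of different rank" % i)
            top = [a + b for a, b in zip(tl, tr)]      # upper half, row by row
            bottom = [a + b for a, b in zip(bl, br)]   # lower half, row by row
            squares.append(top + bottom)
    return squares


# A 4 x 4 checkerboard described by four equations.
checker = ["a", "b", (0, 1, 1, 0), (2, 2, 2, 2)]
board = expand_2d(checker)[-1]
```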

The main problem with 2-dimensional texts is that now there is no analog of
Lemma 5. It is possible that the pattern occurs in T but does not occur on the
overlap of all variables in Compose(X_j, X_k, X_s, X_r). The main idea is to
define special blocks called regular blocks: quadrants of the whole square and
quadrants of these quadrants (recursively). Then we consider vertically
(horizontally) overlapping occurrences of a pattern, which are the ones that
touch the middle point of a (vertical or horizontal) boundary between two
adjacent regular blocks of the same rank.

Using tree-contraction techniques similarly as for RLZ-compression we can prove
the following result related to 2-dimensional compressed matching.



Theorem 5. The pattern-matching problem for 2d-texts with short descriptions
can be solved in O(log^(n + m)) time with O(n^ · m n^) processors, or in
O(n + log m) time with O(n · (n + m)) processors.

Abstract. The different semantics that can be assigned to a logic program
correspond to different assumptions made concerning the atoms whose logical
values cannot be inferred from the rules. Thus, the well founded semantics
corresponds to the assumption that every such atom is false, while the
Kripke-Kleene semantics corresponds to the assumption that every such atom is
unknown. In this paper, we propose to unify and extend this assumption-based
approach by introducing parameterized semantics for logic programs. The
parameter holds the value that one assumes for all atoms whose logical values
cannot be inferred from the rules. We work within Belnap’s four-valued logic,
and we consider the class of logic programs defined by Fitting.

Following Fitting’s approach, we define a simple operator that allows us 
to compute the parameterized semantics, and to compare and combine 
semantics obtained for different values of the parameter. The semantics 
proposed by Fitting corresponds to the value false. We also show that 
our approach captures and extends the usual semantics of conventional 
logic programs thereby unifying their computation. 

Keywords. Four-valued logics, logic programming, logics of knowledge, 
inconsistency. 



1 Introduction 

The different semantics that can be assigned to a logic program correspond to 
different assumptions made concerning the atoms whose logical values cannot 
be inferred from the rules. For example, the well founded semantics corresponds 
to the assumption that every such atom is false (Closed World Assumption), 
while the Kripke-Kleene semantics corresponds to the assumption that every 
such atom is unknown. In general, the usual semantics of logic programs are
given in the context of three-valued logics, and are of two kinds: those based on 
the stable models [7, 11, 12] or on the well-founded semantics [14, 15], and those 
based on the Kripke-Kleene semantics [3]. 

We refer to semantics of the first kind as pessimistic, in the sense that it
privileges negative information: if in doubt, then assume false; and we refer to
semantics of the second kind as skeptical, in the sense that it privileges
neither negative nor positive information: if in doubt, then assume nothing. To
illustrate these semantics, consider the following program:

    convict(X) ← ¬innocent(X) ∧ suspect(X)
    free(X) ← innocent(X) ∧ suspect(X)
    innocent(X) ← free(X)
    suspect(John) ←

The only assertion made in the program is that John is suspect, but we know
nothing as to whether he is innocent.

If we follow the pessimistic approach, then we have to assume that John 
is not innocent, and we can infer that John must not be freed, and must be 
convicted. If, on the other hand, we follow the skeptical approach, then we have 
to assume nothing about the innocence of John, and we can infer nothing as to 
whether he must be freed or convicted. 

However, in the context of three-valued logic, one can envisage a third se- 
mantics, that we call optimistic: if in doubt, then assume true. If we follow this 
approach, then we have to assume that John is innocent, and we can infer that 
John must be freed, and must not be convicted. 

Now, the optimistic approach can be seen as a counterpart of the pessimistic
approach. To find a counterpart for the skeptical approach, one has to adopt a
four-valued logic. In such a logic, one can envisage an inconsistent semantics:
if in doubt, then assume both false and true. Table 1 summarizes the four
possible semantics of the program P, where F, T, U and I stand for false, true,
unknown and inconsistent, respectively.



Table 1 - The four possible semantics of P

Approach      suspect(John)  innocent(John)  free(John)  convict(John)
Pessimistic   T              F               F           T
Optimistic    T              T               T           F
Skeptical     T              U               U           U
Inconsistent  T              I               I           I



In this paper, we define the semantics of a program P using a parameter α
whose value can be any of the above four logical values. Once fixed, the value
of α represents the “default value” for those atoms of P whose values cannot be
inferred from the rules. We define a simple operator that allows us to compute
this parameterized semantics, and also to compare and combine semantics
obtained for different values of α. We show that our semantics extends the
semantics proposed by Fitting [6], and captures the usual semantics of
conventional logic programs thereby unifying their computation. As a
side-result, we propose a new semantics for logic programs, that can be roughly
described as a “compromise” between pessimistic and optimistic semantics.

Motivation for this work comes from the area of knowledge acquisition, where
contradictions may occur during the process of collecting knowledge from
different experts. Indeed, in multi-agent systems, different agents may give
different answers to the same query. It is then important to be able to process
the answers so as to extract the maximum of information on which the various
agents agree, or to detect the items on which the agents give conflicting
answers.

Motivation also comes from the area of deductive databases. Updates leading
to a certain degree of inconsistency should be allowed because inconsistency
can lead to useful information, especially within the framework of distributed
databases.

The use of four-valued logics is justified by the fact that it provides a more
natural modeling framework for the application areas just mentioned. Moreover,
as Arieli and Avron showed in [1], the use of four values is preferable to the
use of three even for tasks that can in principle be handled using only three
values.

The remainder of the paper is organized as follows. In section 2, we recall
very briefly definitions and notations from three- and four-valued logics,
namely, stable models and well-founded semantics, Kripke-Kleene semantics,
Belnap’s logic and Fitting’s programs. We then proceed, in section 3, to define
our parameterized semantics of a Fitting program P. This is done by defining a
parameterized operator whose fixed points we call the α-fixed models of P. Our
treatment in this section is inspired by [6]. If the value of the parameter α is
false, then the α-fixed models correspond to the stable models proposed by
Fitting. In section 4, we restrict our attention to conventional logic programs.
We show that their α-fixed models capture the three-valued stable models, the
well-founded semantics, and the Kripke-Kleene semantics. We also provide a
comparative study of the α-fixed models for the four values of the parameter α,
and propose a “compromise” between pessimistic and optimistic semantics that in
certain cases may lead to the definition of a new semantics. Section 5 contains
concluding remarks and suggestions for further research. Proofs of theorems are
omitted due to lack of space.

2 Preliminaries 

2.1 Three-valued logics

Stable models and well founded semantics. Gelfond and Lifschitz introduced
the notion of stable model [7], in the framework of classical logic under
the closed world assumption. This notion was then extended to three-valued
logics and partial interpretations: Van Gelder, Ross and Schlipf introduced the
well-founded semantics [14, 15], and Przymusinski defined the three-valued
stable models [11]. In fact, as shown in [12], Przymusinski’s extension captures
both the bi-valued stable models and the well-founded semantics.

In Przymusinski’s approach, a conjunctive logic program is a set of clauses
of the form A ← B_1 ∧ ... ∧ B_n ∧ ¬C_1 ∧ ... ∧ ¬C_m, where B_1, ..., B_n,
C_1, ..., C_m are atoms. In this context, a valuation is a mapping that assigns
to each ground atom a truth value from the set {false, unknown, true}. A
valuation can be extended to ground literals and conjunctions of ground literals
in the usual way. To define the stable models and well-founded semantics of a
program P, one uses the extended Gelfond-Lifschitz transformation GL_P [11]
which assigns to each valuation v another valuation GL_P(v) defined as follows:

1. Transform P into a positive program P/v by replacing all negative literals
by their values from v.

2. Compute the least fixpoint of an immediate-consequence operator Φ_{P/v}
defined as follows:

   - if the ground atom A is not in the head of any rule of Inst-P/v¹,
     then Φ_{P/v}(v)(A) = false;

   - if the rule “A ←” occurs in Inst-P/v, then Φ_{P/v}(v)(A) = true;

   - else Φ_{P/v}(v)(A) = ∨{v(B) | A ← B ∈ Inst-P/v}, where ∨ is the extension
     of classical disjunction defined by: false ∨ unknown = unknown;
     true ∨ unknown = true; unknown ∨ unknown = unknown.

The valuation v is a three-valued stable model of P if GL_P(v) = v, and the
least three-valued stable model coincides with the well-founded semantics of P.

It follows from the definition of Φ above that this approach gives greater
importance to negative information, so it is a pessimistic approach.



Kripke-Kleene semantics. Working with a three-valued logic as well, namely
Kleene’s logic, Fitting introduced the Kripke-Kleene semantics [3]. The program
P has the same definition as for stable models, but the operator Φ is now
defined as follows: given a valuation v and a ground atom A in Inst-P,

1. if there is a rule in Inst-P with head A, and the truth value of the body
under v is true, then Φ_P(v)(A) = true;

2. if there is a rule in Inst-P with head A, and for every rule in Inst-P with
head A the truth value of the body under v is false, then Φ_P(v)(A) = false;

3. else Φ_P(v)(A) = unknown.

It follows that this approach gives greater importance to the lack of
information, since unknown is assigned to the atoms whose logical values cannot
be inferred from the rules, so it is a skeptical approach.
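
The three cases of the operator can be traced on the introductory program about
John. The Python sketch below is only an illustration under stated assumptions:
the program is already ground, bodies are conjunctions of literals written as
(atom, is_positive) pairs, the truth values true, false and unknown are encoded
as True, False and None, and the names phi and kripke_kleene are hypothetical.
The semantics is obtained by iterating the operator from the valuation that is
unknown everywhere.

```python
def kleene_and(values):
    """Kleene's strong conjunction; the empty conjunction is true."""
    values = list(values)
    if any(v is False for v in values):
        return False
    if all(v is True for v in values):
        return True
    return None


def literal_value(valuation, literal):
    atom, positive = literal
    v = valuation[atom]
    if v is None or positive:
        return v
    return not v


def phi(rules, atoms, valuation):
    """One application of the operator described in cases 1 to 3 above."""
    result = {}
    for atom in atoms:
        bodies = [kleene_and(literal_value(valuation, lit) for lit in body)
                  for head, body in rules if head == atom]
        if any(b is True for b in bodies):
            result[atom] = True            # case 1
        elif bodies and all(b is False for b in bodies):
            result[atom] = False           # case 2
        else:
            result[atom] = None            # case 3
    return result


def kripke_kleene(rules, atoms):
    valuation = {atom: None for atom in atoms}
    while True:
        nxt = phi(rules, atoms, valuation)
        if nxt == valuation:
            return valuation
        valuation = nxt


john = [
    ("convict", [("innocent", False), ("suspect", True)]),
    ("free", [("innocent", True), ("suspect", True)]),
    ("innocent", [("free", True)]),
    ("suspect", []),
]
john_atoms = ["suspect", "innocent", "free", "convict"]
skeptical = kripke_kleene(john, john_atoms)
```

On this program only suspect becomes true; the three other atoms stay unknown,
which agrees with the skeptical reading given in the introduction.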



2.2 Four-valued logics

Belnap’s logic. In [2], Belnap defines a logic called FOUR intended to deal
with incomplete and inconsistent information. Belnap’s logic uses four logical
values, that we shall denote by F, T, U and I, i.e. FOUR = {F, T, U, I}.
These values can be compared using two orderings, the knowledge ordering and
the truth ordering.

In the knowledge ordering, denoted by ≤_k, the four values are ordered as
follows: U ≤_k F, U ≤_k T, F ≤_k I, T ≤_k I. Intuitively, according to this
ordering, each value of FOUR is seen as a possible knowledge that one can have
about the truth of a given statement. More precisely, this knowledge is
expressed as a set of classical truth values that hold for that statement. Thus,
F is seen as {false}, T is seen as {true}, U is seen as ∅ and I is seen as
{false, true}. Following this viewpoint, the knowledge ordering is just the set
inclusion ordering.

In the truth ordering, denoted by ≤_t, the four logical values are ordered as
follows: F ≤_t U, F ≤_t I, U ≤_t T, I ≤_t T. Intuitively, according to this
ordering, each value of FOUR is seen as the degree of truth of a given
statement. U and I are both less false than F, and less true than T, but U and
I are not comparable.

The two orderings are represented in the double Hasse diagram of Figure 1.

As shown in [8], FOUR is a bilattice under the two orderings. Meet and
join under the truth ordering are denoted by ∧ and ∨, and they are natural
generalizations of the usual notions of conjunction and disjunction. In
particular, U ∧ I = F and U ∨ I = T. Under the knowledge ordering, meet and
join are denoted by ⊗ and ⊕, and are called the consensus and gullibility,
respectively: x ⊗ y represents the maximal information on which x and y agree,
whereas x ⊕ y adds the knowledge represented by x to that represented by y. In
particular, F ⊗ T = U and F ⊕ T = I. FOUR is an infinitary distributive
bilattice which satisfies the infinitary interlacing laws (i.e. each of the
operations ∧, ∨, ⊗, ⊕ is monotone with respect to both orderings; for example,
if a_n ≤_k b_n for all n ∈ S, then ∨{a_n | n ∈ S} ≤_k ∨{b_n | n ∈ S}).

There is a natural notion of negation in the truth ordering, denoted by ¬,
and we have: ¬F = T, ¬T = F, ¬U = U, ¬I = I. There is a similar notion for
the knowledge ordering, called conflation, denoted by −, and we have: −U = I,
−I = U, −F = F, −T = T.

The operations ∨, ∧, ¬ restricted to the values F and T are those of classical
logic, and if we add to these operations and values the value U, then they are
those of Kleene’s strong three-valued logic.
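
The bilattice can be written down in a few lines. The Python sketch below uses
the set-based reading given above: a value is encoded as a pair (t, f), where t
says whether true belongs to the set and f whether false belongs to it. This
encoding and all function names are assumptions chosen for illustration; the
operations themselves are the ones defined in this section.

```python
# (true is supported, false is supported)
F, T, U, I = (0, 1), (1, 0), (0, 0), (1, 1)
FOUR = [F, T, U, I]


def conj(x, y):          # meet in the truth ordering
    return (x[0] & y[0], x[1] | y[1])


def disj(x, y):          # join in the truth ordering
    return (x[0] | y[0], x[1] & y[1])


def consensus(x, y):     # meet in the knowledge ordering
    return (x[0] & y[0], x[1] & y[1])


def gullibility(x, y):   # join in the knowledge ordering
    return (x[0] | y[0], x[1] | y[1])


def neg(x):              # negation: swaps the roles of true and false
    return (x[1], x[0])


def conflation(x):       # conflation: swaps U and I, fixes F and T
    return (1 - x[1], 1 - x[0])


def leq_k(x, y):         # knowledge ordering = set inclusion
    return x[0] <= y[0] and x[1] <= y[1]


def leq_t(x, y):         # truth ordering: more true, less false
    return x[0] <= y[0] and x[1] >= y[1]
```

With this encoding the identities quoted above, such as U ∧ I = F, F ⊕ T = I
and −U = I, become one-line checks.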



Fitting programs. Conventional logic programming has the set {F, T} as
its intended space of truth values, but since not every query may produce an
answer, partial models are often allowed (i.e. U is added). If we want to deal
with inconsistency as well, then I must be added. Thus it seems natural to work
with Belnap’s logic. Fitting extended the notion of logic program to Belnap’s
logic [6] as follows:

- A formula is an expression built up from literals and elements of FOUR, using
∧, ∨, ⊗, ⊕, ∃, ∀.

- A clause is of the form P(x_1, ..., x_n) ← φ(x_1, ..., x_n), where the atomic
formula P(x_1, ..., x_n) is the head, and the formula φ(x_1, ..., x_n) is the
body. It is assumed that the free variables of the body are among x_1, ..., x_n.

- A program is a finite set of clauses with no predicate letter appearing in the
head of more than one clause (this apparent restriction causes no loss of
generality [4]).

¹ Inst-P/v denotes the set of all instantiations of rules of P/v.

We shall refer to such an extended logic program as a Fitting program.²
Fitting also defined the family of conventional logic programs. A conventional
logic program is one whose underlying truth-value space is the bilattice FOUR
and which does not involve ⊗, ⊕, ∀, U or I. Such programs can be written in the
customary way, using commas to denote conjunction.



3 Parameterized semantics for Fitting programs 

In the following, α ∈ FOUR, P is a Fitting program, V(FOUR) is the set of
all valuations in FOUR and Inst-P is the set of all ground instances of rules of
P. Some of the results in this section are inspired by [6], which deals only
with the case α = F.

3.1 Immediate Consequence Operators 

First, we extend the two orderings on FOUR to the space of valuations V(FOUR).

Definition 1. Let v_1 and v_2 be in V(FOUR), then

    v_1 ≤_t v_2 if and only if v_1(A) ≤_t v_2(A) for all ground atoms A;

    v_1 ≤_k v_2 if and only if v_1(A) ≤_k v_2(A) for all ground atoms A.

Under these two orderings V(FOUR) becomes a bilattice, and we have
(v ∧ w)(A) = v(A) ∧ w(A), and similarly for the other operators. V(FOUR)
is infinitely distributive, satisfies the infinitary interlacing conditions and
has a negation and a conflation.

The actions of valuations can be extended from atoms to formulas as follows:

    v(X ∧ Y) = v(X) ∧ v(Y), and similarly for the other operators,
    v((∃x)φ(x)) = ∨_{t closed term} v(φ(t)), and
    v((∀x)φ(x)) = ∧_{t closed term} v(φ(t)).

The predicate equal(x, y) is a predefined predicate defined by: for all
valuations v, v(equal(x, y)) = T if x = y and F if x ≠ y. Moreover, v(a) = a
for all a in FOUR.

The following contrajoin operation assigns to a ground atom A a truth value
independently of the truth value assigned to the negation of A.³

Definition 2 (contrajoin). Let v and w be in V(FOUR).

The contrajoin v △ w is defined as follows: for each ground atom A,

    v △ w(A) = v(A) and v △ w(¬A) = ¬w(A).

Contrajoin operations are extended to formulas by induction. The idea is
that v represents the information about A, and w the information about ¬A. For
example, if v(innocent(John)) = T and w(innocent(John)) = U then
v △ w(innocent(John)) = T, whereas v △ w(¬innocent(John)) = U.

² Actually, Fitting defined logic programs in the context of general bilattices,
but in this article we restrict our attention to the minimal bilattice, FOUR.

³ Our contrajoin operation is exactly the same as the pseudovaluation in [6].
However, we prefer the term contrajoin of v and w as it is more indicative of
the fact that an operation is performed on valuations v and w.

We can now define a new operator which is inspired by [6]. It infers new
information from a contrajoin operation in a way that depends on the value of
the parameter α.

Definition 3. Let v and w be in V(FOUR). The valuation Ψ^α_P(v, w) is defined
as follows:

(1) if the ground atom A is not the head of any rule of Inst-P, then
    Ψ^α_P(v, w)(A) = α;

(2) if A ← B occurs in Inst-P, then Ψ^α_P(v, w)(A) = v △ w(B).

Clearly, the valuation Ψ^α_P(v, w) is in V(FOUR), and as the interlacing
conditions are satisfied by V(FOUR), we can prove the following proposition.

Proposition 1. Let P be a Fitting program.

(1) Under the knowledge ordering, Ψ^α_P is monotonic in both arguments;

(2) Under the truth ordering, Ψ^α_P is monotonic (and moreover continuous) in
    its first argument, and anti-monotonic in its second argument.



We can infer from this proposition that the function λx.Ψ^α_P(x, v) has a least
fixed point and a greatest fixed point for each ordering. We now define a new
operator Ψ'^α_P, which associates to a valuation v one of these fixed points
depending on the value of α. Ψ'^α_P(v) is the iterated fixed point of
λx.Ψ^α_P(x, v) obtained from an initial valuation v_α defined by: v_α(A) = α,
for all ground atoms A.



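Definition 4. Let v be in V(FOUR). Define Ψ'^α_P(v) to be the limit of the
sequence of valuations (a_n) defined as follows:

    a_0 = v_α;
    a_n = Ψ^α_P(a_{n-1}, v), for a successor ordinal n;
    a_λ = ∨_{n<λ} Ψ^α_P(a_n, v)   for α = F,
          ∧_{n<λ} Ψ^α_P(a_n, v)   for α = T,
          ⊕_{n<λ} Ψ^α_P(a_n, v)   for α = U,
          ⊗_{n<λ} Ψ^α_P(a_n, v)   for α = I,
      for a limit ordinal λ.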

In fact, we fix the truth value of negative literals with v, then we compute
the semantics of the positive program thus obtained (in a manner similar to that
of the Gelfond-Lifschitz transformation).

We remark that Ψ'^U_P(v) is the least fixed point of λx.Ψ^U_P(x, v) and
Ψ'^I_P(v) the greatest fixed point of λx.Ψ^I_P(x, v) under the knowledge
ordering. Ψ'^F_P(v) is the least fixed point of λx.Ψ^F_P(x, v) and Ψ'^T_P(v)
the greatest fixed point of λx.Ψ^T_P(x, v) under the truth ordering.

To illustrate this definition consider the following program P and let v be
the valuation which assigns to every ground atom the truth value U:

    P:  A ← B ∧ C
        D ← ⊕ T
        E ← A ⊗ ¬D
        B ← T

    Atom         A  B  C  D  E
    Ψ'^F_P(v)    F  T  F  T  U

To compute Ψ'^F_P(v), we first replace all negative literals by the value U,
then we compute the least model of the positive program thus obtained with
respect to the truth ordering, beginning with the valuation which assigns to
every ground atom the truth value F.

3.2 The family of α-fixed models

We recall that a valuation v is a model of a program P if and only if for all
rules A ← B in Inst-P, v(A) ≤_t v(B) [5]. By definition of Ψ^α_P, a valuation
v that verifies Ψ^α_P(v, v) = v is a model of P. Now, every fixed point m of
Ψ'^α_P verifies Ψ^α_P(m, m) = Ψ'^α_P(m) = m, therefore m is a model of P. So
we can define four new families of models that we call α-fixed models.

Definition 5 ($\alpha$-fixed models). A valuation $v \in V(\mathcal{FOUR})$ is an $\alpha$-fixed model
of a program $\mathcal{P}$ if and only if $v$ is a fixed point of $\Psi'^{\alpha}_{\mathcal{P}}$.

From now on, $F$-fixed models will be called pessimistic, $T$-fixed models
optimistic, $U$-fixed models skeptical, and $I$-fixed models inconsistent. We can
now study the family of $\alpha$-fixed models.

Theorem 1. $\Psi'^{\alpha}_{\mathcal{P}}$ is monotonic under $\leq_k$, and anti-monotonic under $\leq_t$.

Given the monotonicity of $\Psi'^{\alpha}_{\mathcal{P}}$ under the knowledge ordering and the
complete lattice structure of $V(\mathcal{FOUR})$ under this ordering, we can apply the
Knaster-Tarski theorem.

Theorem 2. $\Psi'^{\alpha}_{\mathcal{P}}$ has a least fixed point, denoted $\mathit{Fix}^{\alpha}_{U}$, and a greatest fixed
point, denoted $\mathit{Fix}^{\alpha}_{I}$, with respect to the knowledge ordering.*

We remark that the computation of $\mathit{Fix}^{\alpha}_{U}$, which we call the $\alpha$-fixed semantics,
is similar to the computation of the well-founded semantics via the
Gelfond-Lifschitz transformation.

Four different semantics can now be associated with a Fitting program. The
following example shows how a query can be evaluated in different
contexts corresponding to different values of the "default value" $\alpha$, and how one
can choose the appropriate $\alpha$-fixed semantics.

Example. Let $\mathcal{P}$ be the following program:

Colleague(X, Y) $\leftarrow$ Colleague(Y, X)
Colleague(a, b) $\leftarrow$ T
Colleague(a, c) $\leftarrow$ F

If we have to send information to persons that we are sure are colleagues
of b, we have to choose the pessimistic or the skeptical semantics. The only person
who can be proved to be a colleague of b is a.

Now, if we want to send information to persons that may be colleagues of b,
then we have to choose the optimistic semantics. There are two persons that are
or may be colleagues of b: a and c. The following table summarizes the results.



Semantics              Coll(a,b)  Coll(b,a)  Coll(a,c)  Coll(c,a)  Coll(b,c)  Coll(c,b)
$\mathit{Fix}^{F}_{U}$   T          T          F          F          F          F
$\mathit{Fix}^{T}_{U}$   T          T          F          F          T          T
$\mathit{Fix}^{U}_{U}$   T          T          F          F          U          U
$\mathit{Fix}^{I}_{U}$   T          T          F          F          I          I



* Actually, $\mathit{Fix}^{\alpha}_{U}$ and $\mathit{Fix}^{\alpha}_{I}$ both refer to the program $\mathcal{P}$, and should be denoted
$\mathit{Fix}^{\alpha}_{U,\mathcal{P}}$ and $\mathit{Fix}^{\alpha}_{I,\mathcal{P}}$, respectively. However, in order to simplify the
presentation, we shall omit $\mathcal{P}$ in our notation.

The behavior of $\Psi'^{\alpha}_{\mathcal{P}}$ with respect to the truth ordering is less simple because
$\Psi'^{\alpha}_{\mathcal{P}}$ is anti-monotonic under this ordering. However, there is a modification of
the Knaster-Tarski theorem dealing with precisely this case:

Lemma 1 ([16]).

Suppose that a function $f$ is anti-monotonic on a complete lattice $C$. Then
there are two elements $\mu$ and $\nu$ of $C$, called extreme oscillation points of $f$, such
that the following hold:

- $\mu$ and $\nu$ are the least and greatest fixed points of $f^2$ (i.e. of $f$ composed with itself);

- $f$ oscillates between $\mu$ and $\nu$ in the sense that $f(\mu) = \nu$ and $f(\nu) = \mu$;

- if $x$ and $y$ are also elements of $C$ between which $f$ oscillates then $x$ and $y$ lie
between $\mu$ and $\nu$.

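Under the truth ordering, $\Psi'^{\alpha}_{\mathcal{P}}$ is anti-monotonic and $V(\mathcal{FOUR})$ is a complete
lattice, so $\Psi'^{\alpha}_{\mathcal{P}}$ has two extreme oscillation points under this ordering.

Proposition 2. $\Psi'^{\alpha}_{\mathcal{P}}$ has two extreme oscillation points, denoted $\mathit{Fix}^{\alpha}_{F}$ and $\mathit{Fix}^{\alpha}_{T}$,
with $\mathit{Fix}^{\alpha}_{F} \leq_t \mathit{Fix}^{\alpha}_{T}$, under the truth ordering.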
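
The lemma on anti-monotonic functions can be checked on a small finite lattice. The following Python sketch is only an illustration under stated assumptions: the lattice (subsets of a five-element set ordered by inclusion), the relation `ATTACKS` and the function `f` are hypothetical choices that do not come from the paper. It computes the extreme oscillation points by iterating `f` composed with itself from the bottom and from the top of the lattice.

```python
# Hypothetical example: the subsets of UNIVERSE, ordered by inclusion, form a
# complete lattice, and f below is anti-monotonic on it.
UNIVERSE = frozenset(range(5))
ATTACKS = {(0, 1), (1, 2), (3, 4), (4, 3)}  # assumed relation, illustration only


def f(s):
    """Keep the elements that no member of s attacks (anti-monotonic in s)."""
    return frozenset(x for x in UNIVERSE
                     if not any((y, x) in ATTACKS for y in s))


def iterate_to_fixed_point(g, start):
    """Apply g repeatedly from start until the value stops changing."""
    current = start
    while True:
        nxt = g(current)
        if nxt == current:
            return current
        current = nxt


def oscillation_points(func, bottom, top):
    """Least and greatest fixed points of func composed with itself."""
    def twice(s):
        return func(func(s))
    return (iterate_to_fixed_point(twice, bottom),
            iterate_to_fixed_point(twice, top))


mu, nu = oscillation_points(f, frozenset(), UNIVERSE)
print(sorted(mu), sorted(nu))
```

On this example `f` maps each of the two computed points to the other, which is the oscillation behaviour the lemma describes.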

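We can now extend the result of [6] to any value of $\mathcal{FOUR}$.

Theorem 3. Let $\mathcal{P}$ be a Fitting program. Then we have:

$\mathit{Fix}^{\alpha}_{U} = \mathit{Fix}^{\alpha}_{F} \otimes \mathit{Fix}^{\alpha}_{T}$, $\quad \mathit{Fix}^{\alpha}_{I} = \mathit{Fix}^{\alpha}_{F} \oplus \mathit{Fix}^{\alpha}_{T}$,

$\mathit{Fix}^{\alpha}_{F} = \mathit{Fix}^{\alpha}_{U} \wedge \mathit{Fix}^{\alpha}_{I}$, $\quad \mathit{Fix}^{\alpha}_{T} = \mathit{Fix}^{\alpha}_{U} \vee \mathit{Fix}^{\alpha}_{I}$.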

The family of $\alpha$-fixed models of a program is bounded for each $\alpha \in \mathcal{FOUR}$
as follows: in the knowledge ordering, all $\alpha$-fixed models are between $\mathit{Fix}^{\alpha}_{U}$ and
$\mathit{Fix}^{\alpha}_{I}$, which are the least and greatest $\alpha$-fixed models, respectively; in the truth
ordering, all $\alpha$-fixed models are between $\mathit{Fix}^{\alpha}_{F}$ and $\mathit{Fix}^{\alpha}_{T}$, which are not
necessarily $\alpha$-fixed models of $\mathcal{P}$.

It is interesting to note that for $\alpha = F$ the first equality of Theorem 3
relates two different definitions of the well-founded semantics: the left-hand side,
$\mathit{Fix}^{F}_{U}$, represents the definition of Przymusinski [12] via three-valued stable
models, whereas the right-hand side, $\mathit{Fix}^{F}_{F} \otimes \mathit{Fix}^{F}_{T}$, represents the definition of Van
Gelder via alternating fixed points [13]. Working with bilattices, Fitting
generalized the approach of Van Gelder in [5] and that of Przymusinski in [6].

In the full paper, we present a polynomial algorithm that uses a bottom-up
approach to compute the $\alpha$-fixed semantics of a ground Fitting program $\mathcal{P}$ with
no function symbols.

4 Comparing the usual semantics of logic programs 

In this section, we compare the $\alpha$-fixed models of conventional logic programs
with the usual semantics, then we compare the different usual semantics with
one another.

The following theorem states that the family of stable models is included in
the family of pessimistic fixed models (thus extending stable models from
conventional logic programs to Fitting programs), and that the well-founded
semantics and the Kripke-Kleene semantics are captured (and similarly extended)
by our approach.

Theorem 4. Let $\mathcal{P}$ be a conventional logic program.

(1) If $v$ is a three-valued stable model of $\mathcal{P}$, then $v$ is a pessimistic fixed model.

(2) If $v$ is the well-founded semantics of $\mathcal{P}$, then $v = \mathit{Fix}^{F}_{U}$.

(3) If $v$ is the Kripke-Kleene semantics of $\mathcal{P}$, then $v = \mathit{Fix}^{U}_{U}$.

It is important to recall here that, in our approach, positive and negative
information are treated separately during the computation of $\mathit{Fix}^{U}_{U}$. This is not
the case with the computation of the Kripke-Kleene semantics. Nevertheless, when
we restrict our attention to conventional programs, the two methods compute
the same semantics. Our approach unifies the computation of the usual semantics,
and thus allows us to compare them.

Theorem 5. Let $\mathcal{P}$ be a Fitting program. Then we have:

$\mathit{Fix}^{U}_{U} \leq_k \mathit{Fix}^{F}_{U}$ and $\mathit{Fix}^{U}_{U} \leq_k \mathit{Fix}^{T}_{U}$.

That is, the skeptical semantics gives less information than the pessimistic
and optimistic semantics. From this theorem, we can infer the following result:

Corollary 1. Let $\mathcal{P}$ be a Fitting program. Then we have:

$\mathit{Fix}^{U}_{U} \leq_k \mathit{Fix}^{F}_{U} \otimes \mathit{Fix}^{T}_{U}$.

The equality is satisfied for positive programs, but if we accept negation
then it is false in general. This corollary suggests the possibility of defining a
new semantics, namely $\mathit{Fix}^{F}_{U} \otimes \mathit{Fix}^{T}_{U}$, that is smaller than the pessimistic and
optimistic semantics but greater than the skeptical semantics. The following
example shows that this semantics can be useful in certain contexts.



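Example

$\mathcal{P}$ : $A \leftarrow$ …

Semantics of $\mathcal{P}$:

Atom   $\mathit{Fix}^{F}_{U}$   $\mathit{Fix}^{T}_{U}$   $\mathit{Fix}^{U}_{U}$   $\mathit{Fix}^{F}_{U} \otimes \mathit{Fix}^{T}_{U}$
A      T         T         U         T
B      F         T         U         U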



The program $\mathcal{P}$ seems to assert that A is always true (because it is inferred
from either B or $\neg$B), and this conclusion is reached by both the optimistic and
the pessimistic semantics. However, there is no reason why we should choose
between B true and B false when we cannot assert anything about the value
of B. It seems therefore more natural in this case to take the consensus between
the pessimistic and optimistic semantics, which gives the value unknown to B.

Although this seems to give the right semantics for that situation, one has
to check under what conditions $\mathit{Fix}^{F}_{U} \otimes \mathit{Fix}^{T}_{U}$ is actually a model. Assuming
that it is a model, we can call it the consensus semantics.
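
The computation behind the table above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the algorithm of the full paper: the encoding of the four truth values as pairs, all function names, and in particular the program `A <- B or not B`, `B <- B` are assumptions of this sketch (the program was chosen because it reproduces the values listed in the table). Negative literals are evaluated in a fixed valuation, the resulting positive program is iterated from the constant valuation given by the default value, and the outer loop iterates from the all-unknown valuation.

```python
# Truth values of FOUR encoded as (evidence_for, evidence_against).
T, F, U, I = (1, 0), (0, 1), (0, 0), (1, 1)


def neg(a):
    return (a[1], a[0])


def disj(a, b):
    """Join in the truth ordering."""
    return (a[0] | b[0], a[1] & b[1])


def otimes(a, b):
    """Meet in the knowledge ordering (consensus)."""
    return (a[0] & b[0], a[1] & b[1])


# Assumed program (illustration only):  A <- B or not B ;  B <- B.
# Each body reads positive atoms from `pos` and negated atoms from `negv`.
PROGRAM = {
    "A": lambda pos, negv: disj(pos["B"], neg(negv["B"])),
    "B": lambda pos, negv: pos["B"],
}


def psi(program, x, v):
    """One step: positive atoms are read in x, negative literals in v."""
    return {head: body(x, v) for head, body in program.items()}


def psi_prime(program, alpha, v):
    """Iterate x -> psi(x, v), starting from the constant valuation alpha."""
    x = {atom: alpha for atom in program}
    while True:
        nxt = psi(program, x, v)
        if nxt == x:
            return x
        x = nxt


def alpha_fixed_semantics(program, alpha):
    """Iterate psi_prime from the all-unknown valuation until it is stable."""
    v = {atom: U for atom in program}
    while True:
        nxt = psi_prime(program, alpha, v)
        if nxt == v:
            return v
        v = nxt


pessimistic = alpha_fixed_semantics(PROGRAM, F)
optimistic = alpha_fixed_semantics(PROGRAM, T)
skeptical = alpha_fixed_semantics(PROGRAM, U)
consensus = {a: otimes(pessimistic[a], optimistic[a]) for a in PROGRAM}
print(pessimistic, optimistic, skeptical, consensus)
```

Under these assumptions the consensus valuation keeps A true while leaving B unknown, as in the last column of the table.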



5 Conclusion 

We have defined parametrized semantics for the family of Fitting programs [6],
and an algorithm for their computation. The family of Fitting programs is very
general and includes the conventional logic programs. When we restrict the class
of Fitting programs to the class of conventional logic programs, the new
semantics coincide with the conventional ones. This allows us to compare conventional
semantics in this new setting in which they are embedded. It also allows us to
combine conventional semantics, and thus it suggests the possibility of defining
new semantics such as the consensus semantics that we proposed in this paper.

Extending this work to general bilattices and logics with signs and
annotations is a topic for future work.

Abstract. A novel form of labelled transition system is proposed, where
the labels are the arrows of a category, and adjacent labels in
computations are required to be composable. Such transition systems provide the
foundations for modular SOS descriptions of programming languages.
Three fundamental ways of transforming label categories, analogous to
monad transformers, are provided, and it is shown that their applications
preserve computations in modular SOS. The approach is illustrated with
fragments taken from a modular SOS for ML concurrency primitives.



1 Introduction 

SOS (structural operational semantics) is a widely-used framework for
defining process algebras [12, e.g.] and programming languages [13, e.g.]. Following
Plotkin [22], SOS has often been preferred to the more abstract framework of
denotational semantics. The labelled transition systems that provide the
foundations for SOS are themselves well-studied mathematical objects, with major
applications in software (and hardware) engineering.

Modular SOS is a form of SOS that ensures a high degree of modularity: the
transition rules for each construct are completely independent of the presence
or absence of other constructs in the described language. When one extends
or changes the described language, the description can be extended or changed
accordingly, without reformulation, even though new kinds of information
processing may be required. This is in marked contrast to conventional SOS, where
modularity tends to be quite poor: when extending a pure functional language
with concurrency primitives and/or references, for instance, the original
specification of the transition rules had to be completely reformulated [3].

In denotational semantics, the problem of obtaining good modularity has
received much attention, and has to a large extent been solved by introducing
so-called monad transformers. Modular SOS, somewhat belatedly, provides an
analogous solution for operational semantics.

The basic idea of Modular SOS is to incorporate all semantic entities as 
components of labels. Thus configurations are restricted to syntax and com- 
puted values. The foundations of Modular SOS involve a novel form of labelled 
transition system (LTS), where the labels are the arrows of a category. In contrast
to other frameworks where labels are equipped with categorical structure,
composition here is generally a partial operation, and computations are restricted
to those where all adjacent labels are composable. Note that the labels are no
longer the simple atomic actions often used in studies of process algebra, but
usually have semantic entities (e.g., environments and stores) as components;
so do the objects of the label category, which correspond to the states of the
processed information.

M. Kutylowski, L. Pacholski, T. Wierzbicki (Eds.): MFCS'99, LNCS 1672, pp. 70-80, 1999.

Any arrow-labelled transition system (ALTS) can be reduced to an ordinary
LTS, and the usual notions of derivative and bisimilarity lifted accordingly; a
version of higher-order bisimulation may also be defined directly.

Three fundamental label transformers have been identified; they preserve the 
computations specified by a modular SOS, and their order of application is irrel- 
evant. The label transformers are analogous to some simple monad transformers. 
The one which transforms the label category to incorporate new context infor- 
mation (such as the current environment) adds the same sort of component both 
to arrows and to objects, and composition preserves the value of that compo- 
nent. Also the transformer which incorporates mutable information (such as the 
current store) adds a corresponding component to objects, whereas it extends 
the arrows with a pair of such components; composition on pairs is as for binary 
relations. Finally, the transformer which incorporates emitted information (such 
as synchronization signals) adds a corresponding component to the arrows, but 
leaves the objects unchanged. 

Plan of the Paper: Section 2 starts by recalling the basic notions of SOS and LTS. 
Section 3 defines what an ALTS is, and shows how any ALTS can be reduced to a 
corresponding ordinary LTS. Section 4 provides some simple illustrations of label 
categories. Section 5 defines the three fundamental ways of transforming label 
categories to incorporate further kinds of processed information. Section 6 gives 
some illustrative excerpts from a modular SOS of ML concurrency primitives. 
Section 7 discusses the relationship of Modular SOS to other work. Section 8 
concludes by indicating what remains to be done. Proofs, and some other details
that have been omitted here, may be found in the full version of this paper [18].

2 Conventional SOS 

In the conventional SOS framework [22, 23] programs (and all their constituent 
phrases) are generally modelled as labelled transition systems: 

Definition 1. A labelled transition system (LTS) is a structure $(\Gamma, T, A, \to)$,
where $\Gamma$ is the set of configurations, $T \subseteq \Gamma$ is the set of terminal configurations,
$A$ is the set of labels, and $\to\ \subseteq \Gamma \times A \times \Gamma$ is the transition relation. For
configurations $\gamma, \gamma' \in \Gamma$ and label $\alpha \in A$, the assertion that $(\gamma, \alpha, \gamma')$ is in the
transition relation is written $\gamma \xrightarrow{\alpha} \gamma'$ (implying $\gamma \notin T$).

A computation (from $\gamma$) is a sequence of transitions $\gamma \xrightarrow{\alpha_1} \gamma_1 \xrightarrow{\alpha_2} \cdots$, which
is either (countably) infinite or finishes with a configuration $\gamma' \in T$.

The main characteristic feature of SOS is that transitions are specified
inductively, according to the syntactic structure of the described language, by rules:

$$\frac{\gamma_1 \xrightarrow{\alpha_1} \gamma_1' \quad \cdots \quad \gamma_n \xrightarrow{\alpha_n} \gamma_n'}{\gamma \xrightarrow{\alpha} \gamma'}$$

The syntactic components of $\gamma_1, \ldots, \gamma_n$ are generally sub-phrases of the
syntactic component of $\gamma$. Other formulae, such as equations, may be used as
side-conditions on rules (often listed together with the premises). The intended
transition relation is the least such relation that is closed under the given rules.¹

There are two distinct styles of SOS: in so-called small-step SOS, each tran- 
sition in a computation generally corresponds to an indivisible item of informa- 
tion processing, such as adding two computed numbers, or assigning a computed 
number to a variable; in big-step SOS, also known as Natural Semantics [11], a 
computation is a single transition leading directly to a terminal configuration, 
corresponding to the combination of many items of information processing. The 
two styles may be mixed in the same description, e.g., big-step for expression 
evaluation and small-step for command execution; alternatively, the transitive
closure of a small-step transition relation can be used to represent a big-step
relation [22].

Intermediate configurations in small-step SOS generally involve an extension 
of abstract syntax, where any phrase can be replaced by its computed value. 
Let us refer to such an extended syntax as value-added. (In some languages, the 
computed values can be identified with canonical terms of the original syntax, 
so such an extension is not needed.) 

Configurations often involve familiar semantic components, such as stores
that map variables to their assigned values. Environments (mapping identifiers
to their denoted values) are however usually treated as separate arguments of
a relative transition relation $\rho \vdash \gamma \xrightarrow{\alpha} \gamma'$ [11,22]; this complication can be
avoided by using syntactic substitution instead of environments (although it
is quite tedious to define substitution when binding constructs introduce local
scopes for variables). Input, output, and synchronization signals are all generally
recorded in the labels on transitions.

For detailed explanations of the conventional SOS framework, the reader is
referred to [1, 10, 11, 21-24, 27]. The lack of modularity in conventional SOS may
be observed in many papers in the literature [3, e.g.].



3 Modular SOS 

Modular SOS (MSOS) is a particularly simple and uniform style of SOS. The
essential idea is to use the labels on transitions to represent general information
processing steps; the configurations merely keep track of the flow of control and
computed values, and are therefore restricted to syntax and computed values
(i.e., value-added syntax, see Sect. 2). The transition relation is required to be
ternary ($\gamma \xrightarrow{\alpha} \gamma'$), so the only place left for the usual semantic components of
transitions (such as environments and stores) is in the labels.

¹ A more complicated definition is needed when negations of assertions of transitions
are allowed in premises [7].

In a transition $\gamma \xrightarrow{\alpha} \gamma'$, the label $\alpha$ must itself determine the state of the
processed information both before and after the step. Two such transitions can
be adjacent in a computation only when the state after the first and the state
before the second are identical. This intuition is conveniently represented by
regarding the labels as the arrows of a category, with the states as the objects
of the category. The foundations for MSOS are provided by such arrow-labelled
transition systems. (Surprisingly, this appears to be a novel combination of the
familiar notions of LTS and category.)



3.1 Arrow-Labelled Transition Systems 

Definition 2. An arrow-labelled transition system (ALTS) is a labelled
transition system $(\Gamma, T, \mathbb{A}, \to)$, where $\mathbb{A}$ is a category. The set of objects of $\mathbb{A}$
is written $|\mathbb{A}|$. Each arrow $\alpha \in \mathbb{A}$ has a source object $pre(\alpha)$ and a target
object $post(\alpha)$; each object $o \in |\mathbb{A}|$ has an identity arrow $id(o)$. Composition
of arrows $\alpha_1, \alpha_2$ is written $\alpha_1 ; \alpha_2$, in diagrammatic order, and is defined iff
$post(\alpha_1) = pre(\alpha_2)$. The set of identity arrows of $\mathbb{A}$ is written $I_{\mathbb{A}}$, or just $I$
when $\mathbb{A}$ is evident. Let the variables $\iota, \iota', \iota_1$, etc., range over $I$.

A computation in the ALTS (from $\gamma$) is a sequence of transitions
$\gamma \xrightarrow{\alpha_1} \gamma_1 \xrightarrow{\alpha_2} \cdots$, which is either (countably) infinite or finishes with a configuration
$\gamma' \in T$, and moreover such that all adjacent labels $\alpha_i, \alpha_{i+1}$ in it are composable
in the category $\mathbb{A}$ (i.e., the labels in a computation trace a path through $\mathbb{A}$).

Identity arrows are also called silent or unobservable; they generally label
transitions that merely reduce the configuration, e.g., computing a new value from
already-computed arguments, or propagating an exception.
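
The composability requirement on computations can be made concrete with a short Python sketch. It is only an illustration under assumptions: the class `Arrow`, its fields and the helper names are hypothetical, and an arrow is represented simply by its source state, its target state and any additional emitted information. Composition is partial, and a sequence of labels is admissible only if adjacent labels compose.

```python
from dataclasses import dataclass
from typing import Hashable, Sequence


@dataclass(frozen=True)
class Arrow:
    """A label: source state, target state, and emitted information."""
    pre: Hashable
    post: Hashable
    emit: tuple = ()


def composable(first: Arrow, second: Arrow) -> bool:
    """Two labels compose iff the target of the first is the source of the second."""
    return first.post == second.pre


def compose(first: Arrow, second: Arrow) -> Arrow:
    """Partial composition in diagrammatic order."""
    if not composable(first, second):
        raise ValueError("arrows are not composable")
    return Arrow(first.pre, second.post, first.emit + second.emit)


def identity(state: Hashable) -> Arrow:
    """The silent (unobservable) label on a state."""
    return Arrow(state, state)


def traces_path(labels: Sequence[Arrow]) -> bool:
    """Check that all adjacent labels of a computation are composable."""
    return all(composable(a, b) for a, b in zip(labels, labels[1:]))


good = [Arrow("s0", "s1"), identity("s1"), Arrow("s1", "s2", ("out",))]
bad = [Arrow("s0", "s1"), Arrow("s0", "s2")]
print(traces_path(good), traces_path(bad))
```

In this representation, composing with an identity arrow leaves a label unchanged, which matches the reading of identity arrows as silent steps.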

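It is straightforward to generalize the usual inductive definition of the
transitive closure of the transition relation to ALTS:

$$\frac{\gamma \xrightarrow{\alpha} \gamma'}{\gamma \xrightarrow{\alpha}^{+} \gamma'}
\qquad
\frac{\gamma \xrightarrow{\alpha_1}^{+} \gamma_1 \quad \gamma_1 \xrightarrow{\alpha_2}^{+} \gamma_2 \quad \alpha = \alpha_1 ; \alpha_2}{\gamma \xrightarrow{\alpha}^{+} \gamma_2}$$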



3.2 Reduction of ALTS to LTS 

Let $(\Gamma, T, \mathbb{A}, \to)$ be an ALTS. We reduce it to an LTS $(\Gamma^{\sharp}, T^{\sharp}, A^{\sharp}, \to^{\sharp})$ by
incorporating the states of the label category in the configurations, removing
them from the labels, and forgetting that $\mathbb{A}$ is a category.

A label $\alpha \in \mathbb{A}$ determines the source and target states $pre(\alpha)$ and $post(\alpha)$.
It may however contain further information that is not derivable from these
states; let us denote this remainder by $emit(\alpha)$. Thus for some function $f$ we
have $\alpha = f(pre(\alpha), emit(\alpha), post(\alpha))$, for all $\alpha \in \mathbb{A}$.

So let $\Gamma^{\sharp} = \Gamma \times |\mathbb{A}|$, $T^{\sharp} = T \times |\mathbb{A}|$, $A^{\sharp} = ran(emit)$, and let $\to^{\sharp}$ be such
that $((\gamma, o), emit(\alpha), (\gamma', o')) \in\ \to^{\sharp}$ iff $(\gamma, f(o, emit(\alpha), o'), \gamma') \in\ \to$.

3.3 Bisimilarity

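Using the reduction from ALTS to LTS given above, we may lift the usual
notions of derivative and (bi)similarity from LTS to ALTS. The following direct
definition of (bi)similarity allows for a subsidiary similarity relation on labels:

Definition 3. Let $ALTS_1 = (\Gamma_1, T_1, \mathbb{A}_1, \to_1)$, $ALTS_2 = (\Gamma_2, T_2, \mathbb{A}_2, \to_2)$.
A pair of relations $S \subseteq (\Gamma_1 \times |\mathbb{A}_1|) \times (\Gamma_2 \times |\mathbb{A}_2|)$, $\mathcal{S} \subseteq \mathbb{A}_1 \times \mathbb{A}_2$ is called a strong
simulation iff, for all $\gamma_1, o_1, \gamma_2, o_2$, $(\gamma_1, o_1)\ S\ (\gamma_2, o_2)$ implies that:

- whenever $\gamma_1 \xrightarrow{\alpha_1}_1 \gamma_1'$ with $pre(\alpha_1) = o_1$, $post(\alpha_1) = o_1'$, then $\gamma_2 \xrightarrow{\alpha_2}_2 \gamma_2'$ for
some $\alpha_2, \gamma_2', o_2'$ with $pre(\alpha_2) = o_2$, $post(\alpha_2) = o_2'$, $(\gamma_1', o_1')\ S\ (\gamma_2', o_2')$ and
$\alpha_1\ \mathcal{S}\ \alpha_2$;

- whenever $\gamma_1 \in T_1$ then $\gamma_2 \in T_2$.

The pair $(S, \mathcal{S})$ is a strong bisimulation when both $(S, \mathcal{S})$ and $(S^{-1}, \mathcal{S}^{-1})$ are
strong simulations. $\gamma_1$ and $\gamma_2$ are strongly bisimilar iff there is a strong
bisimulation $(S, \mathcal{S})$ such that for all $o_1, o_2$ with $id(o_1)\ \mathcal{S}\ id(o_2)$ we have $(\gamma_1, o_1)\ S\ (\gamma_2, o_2)$.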

4 Basic Label Categories 

Let us consider some simple examples of label categories. In the next section 
they are generalized to generic transformers of label categories. 

TrivCat: The category TrivCat is a category with a single object and a single
(identity) arrow. Taking labels in TrivCat gives an ALTS corresponding
to an unlabelled transition system where the configurations have no
semantic components at all: the transition relation $e \to e'$ is essentially term
rewriting.

EnvCat: Let Env be a set of environments (i.e., finite maps from identifiers to
values). Then EnvCat is the discrete category that has the environments
$\rho \in$ Env both as the objects and as the (identity) arrows. Composition of two
arrows is defined only when they are the same environment. Taking labels
in EnvCat gives an ALTS corresponding to an LTS with an (unlabelled)
relative transition relation $\rho \vdash e \to e'$.

StoreCat: Let Store be a set of stores (i.e., finite maps from addresses to
values). Then StoreCat is the category with stores $s \in$ Store as objects,
and with pairs of stores $(s, s')$ as arrows. Arrow composition $(s_1, s_1') ; (s_2, s_2')$
is defined only when $s_1' = s_2$, and then returns $(s_1, s_2')$. Taking labels in
StoreCat gives an ALTS corresponding to an (unlabelled) LTS where
configurations are of the form $(e, s)$.

ActCat: Let Act be some set. Then ActCat is a category with a single object,
where the arrows are finite sequences $[a_1, \ldots, a_n] \in$ Act*. Composition
is totally defined as sequence concatenation; the empty sequence [] is the
identity arrow. Taking labels in ActCat gives an ALTS corresponding to
an LTS where the labels are just single actions $a \in$ Act, together with the
unobservable action $\tau$, as is usual in studies of process algebra [12, e.g.].
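
The composition operations of these basic label categories are simple enough to write down directly. The Python sketch below is an illustration under assumptions: the function names are hypothetical, environments and stores are modelled as dictionaries, action sequences as tuples, and an undefined composition is signalled by returning `None`.

```python
def compose_env(rho1, rho2):
    """EnvCat: arrows are environments; defined only for equal environments."""
    return rho1 if rho1 == rho2 else None


def compose_store(arrow1, arrow2):
    """StoreCat: (s1, s1') ; (s2, s2') = (s1, s2'), defined only when s1' = s2."""
    (s1, s1_after), (s2, s2_after) = arrow1, arrow2
    return (s1, s2_after) if s1_after == s2 else None


def compose_act(seq1, seq2):
    """ActCat: composition is total and is sequence concatenation."""
    return seq1 + seq2


# Hypothetical data for illustration.
env = {"x": 1}
step1 = ({}, {"l": 1})            # a store update from {} to {l: 1}
step2 = ({"l": 1}, {"l": 2})      # a later update of the same location

print(compose_env(env, {"x": 1}))      # same environment: defined
print(compose_store(step1, step2))     # intermediate stores agree: defined
print(compose_store(step2, step1))     # intermediate stores differ: undefined
print(compose_act(("a",), ()))         # the empty sequence acts as identity
```

The store case shows why composition is partial: the two steps can only be adjacent when the store after the first is the store before the second.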
5 Fundamental Label Transformers



The particular label categories given as examples in Sect. 4 may all be obtained 
by applying general label transformers to TrivCat. Each transformer corre- 
sponds to a fundamentally different way of processing information: allowing it 
to be inspected, or to be both inspected and provided, or merely to be provided. 

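The transformers are defined concretely as follows, mapping a given label
category $\mathbb{A}$ to a (richer) label category $\mathbb{A}^{\sharp}$. Let Index be the set of indices $i$
that may be used to refer to components of labels. Let Univ be the universe
whose elements represent items of processed information.

Definition 4 (Label Transformers).

ContextInfo$(i, E) : \mathbb{A} \mapsto \mathbb{A}^{\sharp}$, where for any $E \subseteq$ Univ,

$$\mathbb{A}^{\sharp} = \mathbb{A} \times E$$
$$|\mathbb{A}^{\sharp}| = |\mathbb{A}| \times E$$
$$I^{\sharp} = I \times E$$
$$(\alpha_1, e_1) ; (\alpha_2, e_2) = \begin{cases} (\alpha_1 ; \alpha_2, e_1), & \text{if } e_1 = e_2 \\ \text{undefined}, & \text{otherwise} \end{cases}$$

Note that $(\alpha_1 ; \alpha_2, e_1)$ above is undefined whenever $\alpha_1 ; \alpha_2$ is undefined,
and similarly elsewhere.

MutableInfo$(i, S) : \mathbb{A} \mapsto \mathbb{A}^{\sharp}$, where for any $S \subseteq$ Univ,

$$\mathbb{A}^{\sharp} = \mathbb{A} \times (S \times S)$$
$$|\mathbb{A}^{\sharp}| = |\mathbb{A}| \times S$$
$$I^{\sharp} = I \times \{(s, s) \mid s \in S\}$$
$$(\alpha_1, (s_1, s_1')) ; (\alpha_2, (s_2, s_2')) = \begin{cases} (\alpha_1 ; \alpha_2, (s_1, s_2')), & \text{if } s_1' = s_2 \\ \text{undefined}, & \text{otherwise} \end{cases}$$

EmittedInfo$(i, A, f, \tau) : \mathbb{A} \mapsto \mathbb{A}^{\sharp}$, where for any monoid $A \subseteq$ Univ with
operation $f : A \times A \to A$ and unit $\tau \in A$:

$$\mathbb{A}^{\sharp} = \mathbb{A} \times A$$
$$|\mathbb{A}^{\sharp}| = |\mathbb{A}|$$
$$I^{\sharp} = I \times \{\tau\}$$
$$(\alpha_1, a_1) ; (\alpha_2, a_2) = (\alpha_1 ; \alpha_2, f(a_1, a_2))$$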

Each transformer should also be defined to lift to $\mathbb{A}^{\sharp}$ the evident general
functions $set : \mathbb{A} \times Index \times Univ \to \mathbb{A}$ and $get : \mathbb{A} \times Index \to Univ$, which replace,
respectively return, the values of particular components in labels, independently
of the presence or absence of other components. For MutableInfo$(i, S)$, where
the label components are pairs $(s, s')$, it is convenient to use the derived functions
$get_{pre}(\alpha, i) = \pi_1(get(\alpha, i))$ and $set_{post}(\alpha, i, s') = set(\alpha, i, (\pi_1(get(\alpha, i)), s'))$.
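
How the three transformers stack can be sketched in Python by treating each one as a function that extends a partial composition operation. This is a minimal sketch under assumptions, not the paper's formal construction: the function names, the nested-pair representation of labels, and the use of `None` for undefined compositions are all hypothetical choices, and the index argument `i` and the lifted `set`/`get` functions are omitted.

```python
def compose_triv(a1, a2):
    """Trivial label category: one object and one (identity) arrow, written ()."""
    return ()


def context_info(compose):
    """Add a component that composition preserves (e.g. an environment)."""
    def composed(l1, l2):
        (a1, e1), (a2, e2) = l1, l2
        inner = compose(a1, a2)
        if inner is None or e1 != e2:
            return None
        return (inner, e1)
    return composed


def mutable_info(compose):
    """Add a before/after pair, composed like binary relations (e.g. stores)."""
    def composed(l1, l2):
        (a1, (s1, s1_after)), (a2, (s2, s2_after)) = l1, l2
        inner = compose(a1, a2)
        if inner is None or s1_after != s2:
            return None
        return (inner, (s1, s2_after))
    return composed


def emitted_info(compose, op):
    """Add an emitted component combined by a monoid operation."""
    def composed(l1, l2):
        (a1, x1), (a2, x2) = l1, l2
        inner = compose(a1, a2)
        if inner is None:
            return None
        return (inner, op(x1, x2))
    return composed


# Environment, then store, then emitted actions (tuples under concatenation).
compose = emitted_info(mutable_info(context_info(compose_triv)),
                       lambda x, y: x + y)

l1 = ((((), "rho"), ("s0", "s1")), ("a",))
l2 = ((((), "rho"), ("s1", "s2")), ())
l3 = ((((), "other"), ("s1", "s2")), ())
print(compose(l1, l2))
print(compose(l1, l3))
```

Each wrapper only inspects the component it added and delegates the rest, which is one way to see why the constructs described before a transformer is applied are unaffected by it.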
A crucial property of the fundamental label transformers is that they preserve
the computations specified by a set of transition rules. Thus to extend an MSOS
one may first apply a label transformer, without changing the semantics of those
constructs that have already been described, and then proceed to exploit the
new component of labels in the description of new constructs.

The preservation of computations requires that the side-conditions of rules
are unaffected by the label transformation. In practice, disciplined use of the
general functions $set(\alpha, i, u)$, $get(\alpha, i)$ in side-conditions of rules, as illustrated
in the next section, ensures this property.

For each fundamental label transformer, functors $F : \mathbb{A} \to \mathbb{A}^{\sharp}$, $G : \mathbb{A}^{\sharp} \to \mathbb{A}$
can easily be defined so that $F$ followed by $G$ is the identity. This is all that is
needed for preservation of computations:

Proposition 1. Let $\mathbb{A}$, $\mathbb{A}'$ be label categories related by functors $F : \mathbb{A} \to \mathbb{A}'$,
$G : \mathbb{A}' \to \mathbb{A}$ such that $F$ followed by $G$ is the identity on $\mathbb{A}$. Let sets of configurations
$\Gamma$, $T$ be given. Let $R$ be a set of positive transition rules, such that the holding
of side-conditions is the same for $\mathbb{A}$ and $\mathbb{A}'$. Let $\to$, $\to'$ be the transition
relations specified by $R$ with labels ranging over $\mathbb{A}$, respectively $\mathbb{A}'$. Then for
each computation $\gamma \xrightarrow{\alpha_1} \gamma_1 \xrightarrow{\alpha_2} \cdots$ in $(\Gamma, T, \mathbb{A}, \to)$ there is a computation
$\gamma \xrightarrow{\alpha_1'}{}' \gamma_1 \xrightarrow{\alpha_2'}{}' \cdots$ in $(\Gamma, T, \mathbb{A}', \to')$, and vice versa. Furthermore, the labels $\alpha_i$
can be recovered by applying $G$ to the labels $\alpha_i'$.

The result is proved by regarding $R$ as a schematic specification of a label-indexed
family of binary relations $\xrightarrow{\alpha}$, and by using $F$ and $G$ to transform proofs for
individual transitions. From the preservation of composition by the functors $F$
and $G$, the desired property for computations follows.

6 Illustrative Examples 

The following fragments are taken from a complete Modular SOS of ML concur- 
rency primitives [20]. Following [3], we describe first a purely functional fragment, 
and extend it both with references and with processes. In the original SOS, each 
extension involved a complete reformulation of the rules given for the functional 
fragment; with MSOS, no such reformulation is needed, and the extensions below 
may be made in any order. 

6.1 The Functional Fragment (excerpts) 

Exp   $e ::= v \mid x \mid e_1\ e_2 \mid \ldots$

Val   $v ::= c \mid (v_1, v_2) \mid \lambda x(e)$

No label transformers are required here since we follow [3, 25, 26] and use
syntactic substitution $e[x \mapsto v]$ instead of environments and closures.

$$\frac{e_1 \xrightarrow{\alpha} e_1'}{e_1\ e_2 \xrightarrow{\alpha} e_1'\ e_2}
\qquad
\frac{e_2 \xrightarrow{\alpha} e_2'}{v_1\ e_2 \xrightarrow{\alpha} v_1\ e_2'} \qquad (1)$$

$$\lambda x(e)\ v \xrightarrow{\iota} e[x \mapsto v] \qquad (2)$$

6.2 An Imperative Extension (excerpts)

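Const   $c ::= \ldots \mid$ assign $\mid$ deref

Val   $v ::= \ldots \mid l$

The label transformer MutableInfo(store, Store) is used, where Store =
Loc $\to$ Val. The notation $s[l \mapsto v]$ below denotes the store that maps $l$ to $v$,
and otherwise maps locations $l'$ to their values $s(l')$ according to $s$.

$$\frac{s = get_{pre}(\iota, store) \quad l \in dom(s) \quad \alpha = set_{post}(\iota, store, s[l \mapsto v])}{\text{assign}\ (l, v) \xrightarrow{\alpha} ()}$$

$$\frac{s = get_{pre}(\iota, store) \quad v = s(l)}{\text{deref}\ l \xrightarrow{\iota} v}$$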



6.3 Concurrent Processes (excerpts) 



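Exp     $e ::= \ldots \mid$ sync $e \mid$ spawn $e$
Const   $c ::= \ldots \mid$ receive $\mid$ transmit
Val     $v ::= \ldots \mid k \mid ev$
Event   $ev ::= k!v \mid k? \mid \ldots$
Progs   $p ::= e \mid p_1 \parallel p_2$

The label transformer EmittedInfo(acts, Act*, concat, []) is used, where Act =
(Event $\times$ Exp) $\cup$ Val, [] is the empty sequence, $[a]$ is the sequence formed from
the action $a$, and $concat([a_1, \ldots, a_m], [a_1', \ldots, a_n']) = [a_1, \ldots, a_m, a_1', \ldots, a_n']$.

$$\text{transmit}\ (k, v) \xrightarrow{\iota} k!v \qquad \text{receive}\ k \xrightarrow{\iota} k? \qquad (5)$$

$$\frac{\alpha = set(\iota, acts, [(ev, e)])}{\text{sync}\ ev \xrightarrow{\alpha} e}$$

$$\frac{\alpha = set(\iota, acts, [v])}{\text{spawn}\ v \xrightarrow{\alpha} ()}
\qquad
\frac{e \xrightarrow{\alpha} e' \quad get(\alpha, acts) = [v] \quad \alpha' = set(\alpha, acts, [])}{e \xrightarrow{\alpha'} e' \parallel (v\ ())} \qquad (7)$$

$$\frac{p_1 \xrightarrow{\alpha} p_1'}{p_1 \parallel p_2 \xrightarrow{\alpha} p_1' \parallel p_2}
\qquad
\frac{p_2 \xrightarrow{\alpha} p_2'}{p_1 \parallel p_2 \xrightarrow{\alpha} p_1 \parallel p_2'} \qquad (8)$$

$$\frac{\begin{array}{c}
p_1 \xrightarrow{\alpha_1} p_1' \quad p_2 \xrightarrow{\alpha_2} p_2' \\
get(\alpha_1, acts) = [(ev_1, e_1)] \quad get(\alpha_2, acts) = [(ev_2, e_2)] \\
ev_1 \overset{k}{\asymp} ev_2 \text{ with } (e_1, e_2) \quad \alpha = set(\alpha_1 ; \alpha_2, acts, [])
\end{array}}{p_1 \parallel p_2 \xrightarrow{\alpha} p_1' \parallel p_2'}$$

The relation $ev_1 \overset{k}{\asymp} ev_2$ with $(e_1, e_2)$ holds when the events $ev_1$ and $ev_2$ match (on
channel $k$) with respective results $e_1$ and $e_2$, as defined in [25,26]. For instance,
$k!v \overset{k}{\asymp} k?$ with $((), v)$ holds.

7 Relation to Other Work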

This paper develops ideas first explored by the author in [17]. The technique of 
incorporating all semantic information in labels has previously been proposed 
as a general principle for SOS also by Degano and Priami [5], and exploited 
by them to obtain parametricity in the framework of Enhanced Operational 
Semantics. However, they did not abstract from the structure of labels (which
is a crucial step for obtaining full modularity and extensibility), nor did they
consider partial composition of labels. The Tile Model framework of Gadducci
and Montanari [9] provides categorical structure on labels, but is otherwise not 
closely related to the present approach. 

There has been extensive work on various formats of small-step SOS (see a 
recent paper by Fokkink and Verhoef [8] for references), but the conservativity 
results obtained there concern extensions with new syntax and rules, rather than 
changes to labels. An SOS format with terms as labels has been proposed by 
Bernstein [2], but modularity was not considered. The recent work of Turi and 
Plotkin [28] using coalgebraic techniques in SOS addresses foundational issues, 
and appears not to improve the modularity of semantic descriptions; moreover,
the approach does not yet seem to be applicable to the description of conventional
programming languages.

A non-structural but quite succinct approach to operational semantics is to 
give an (unlabelled) reduction semantics for applications of evaluation contexts 
C[t], following Felleisen et al. [6,29]. The use of evaluation contexts appears to 
provide some inherent modularity, but obtaining full modularity may involve 
the introduction of many artificial internal steps [4]. Reppy’s evaluation-context 
semantics for ML concurrency primitives [25, 26] has better modularity than the 
SOS given in [3]— see [20] for a detailed comparison of it with an MSOS for the 
same language. 

8 Conclusion 

The issue of modularity is significant for practical application of formal
semantics. The structural approach to operational semantics is particularly popular for
describing both conventional programming languages and process algebras, and
it is widely taught to undergraduates [10,21,27]. Its poor modularity was left
as an open problem by Plotkin [22, p.64]. The approach proposed in the present 
paper provides modularity in SOS through the use of a more disciplined meta- 
notation, while retaining the full generality of Plotkin’s original framework. The 
fundamental label transformers of MSOS incorporate the standard techniques 
used in SOS, in much the same way as monad transformers in denotational se- 
mantics incorporate standard techniques for constructing domains. All this is 
obtained through a simple (yet apparently novel) combination of the familiar 
notions of labelled transition system and category. 

A higher-level approach to obtaining modularity in operational semantics,
called Action Semantics, has previously been proposed by the author, in
collaboration with Watt [14-16]. It employs a rich semantic notation called Action
Notation, whose operational semantics was originally defined using SOS [15,
Apps. B-C]. The lack of modularity of that SOS has hindered the definition of
extensions or variants of Action Notation. An MSOS of Action Notation has
recently been developed [19], and its modularity should greatly facilitate the
reconsideration of the detailed design of Action Notation [16, Sect. 8].

The full MSOS descriptions of Action Notation and of ML concurrency prim- 
itives should provide sufficient evidence of the benefits of MSOS as a descriptive 
framework, and of the way that it scales up smoothly to richer languages. It 
remains to be shown that appropriate theories of semantic equivalence can be 
established using the direct definition of bisimilarity for ALTS, and that no
further label transformers are needed.

Message Sequence Graphs and Decision Problems on Mazurkiewicz Traces

Abstract. Message sequence charts (MSC) are a graphical specification
language widely used for designing communication protocols. Our
starting point is a pair of decision problems concerning the correctness and
the consistency of a design based on MSC graphs. Both problems are
shown to be undecidable in general. Using a natural connectivity
assumption from Mazurkiewicz trace theory we show both problems to be
EXPSPACE-complete for locally synchronized graphs. The results are
based on new complexity results for star-connected rational trace
languages.

Keywords. Message sequence graphs, Mazurkiewicz semi-traces, au- 
tomata theory, universality problem 



1 Introduction 

A recent trend in formal methods is the use of tools and techniques that are
based on visual notation. Another trend is the use of standard methods, allowing
seamless transfer of technology. Message sequence charts (MSCs) are a notation
that has a standard visual and textual presentation (Z.120, see figures below).
This notation is frequently used for specifying the design of communication
protocols. It abstracts away from, e.g., the actual code or the values of variables,
and concentrates on the messages exchanged between the different participating
processes.

Analogously to systems described using finite state automata, there are
natural algorithmic problems which arise from debugging the design of
communication systems using MSCs. Such problems are related to the correctness of the
design with respect to the specification, and its internal consistency. It may
initially seem that MSCs are easier to analyze than automata-based finite state
systems, since variables and values are abstracted away. It turns out that this
is not the case: the semantics of MSCs is based on a partial order between its
events (in comparison with the total order model in the traditional interleaving
semantics). Further, it does not assume any bound on the capacity of its message
queues. In its general form, the MSC notation allows infinite computations by
using MSC graphs, in which each graph node includes an MSC.

* The results were partly supported by Bell Labs and DIMACS.

In this paper we study two decision problems for MSC graphs, detecting race 
conditions and verifying confluence. For race conditions there is a quadratic al- 
gorithm for plain (finite) MSCs, which has been implemented e.g. in the tool 
uBET, [1,7]. A variant of the confluence problem has been considered in [2]. 
Both the specification and the execution sequences of MSC graphs are captured 
by the notion of rational trace language, which corresponds to the closure of 
regular languages under partial commutations. This easily yields the undecid- 
ability of both questions. However, we are interested in reasonable restrictions 
of MSC graphs which guarantee the decidability of verification tasks. A main 
result of trace theory [9, 10, 13] states that loop-connected automata (or star- 
connected regular expressions) are equivalent to regular languages closed under 
partial commutations. (For asymmetric partial commutations only the inclusion 
from left to right holds [3].) This is a very natural restriction for protocols
specified by MSC graphs, too. It simply means that we disallow global
synchronization (needed for disconnected components) and unbounded message sequences
in one direction only (without acknowledgment). This directly leads to
considering decision problems on rational trace languages specified by loop-connected
automata. Surprisingly, this computational aspect of rational trace languages
has received little attention until now. We show for example that the
universality problem for this class is EXPSPACE-complete. The same complexity bound
follows for both MSC problems. Furthermore, we show that the connectivity 
property for automata is co-NP-complete. 



2 Preliminaries 

We first recall the notion of Mazurkiewicz (semi-)traces [5, 8]. An independence
alphabet is a pair $(A, I)$, where $A$ is an alphabet endowed with an irreflexive
relation $I \subseteq A \times A$, called the independence relation (or commutation
relation). Note that we do not assume that $I$ is symmetric. With a given
independence alphabet $(A, I)$ we associate a rewriting relation $\Rightarrow_I$,
given as the reflexive, transitive closure of $\to_I$, where $xaby \to_I xbay$ for
any contexts $x, y \in A^*$ and $(a, b) \in I$. For $u \in A^*$ let
$[u]_I \subseteq A^*$ be the set of all $v$ with $u \Rightarrow_I v$; then $[u]_I$
is called a semi-trace over the independence alphabet $(A, I)$. Let
$D = (A \times A) \setminus I$ denote the complementary (dependence) relation. A
semi-trace $[u_1 \cdots u_n]_I$, $u_i \in A$, can also be viewed as a poset
$(\{1, \ldots, n\}, \prec)$ with $i \prec j$ whenever there is some dependence path
$i = i_1 < \cdots < i_l = j$, i.e., $(u_{i_k}, u_{i_{k+1}}) \in D$ for all $k < l$.
Together with the concatenation $[u]_I [v]_I = [uv]_I$, the set of semi-traces
forms the monoid of semi-traces $M(A, I)$. For any set $L \subseteq A^*$ we define
the $I$-closure of $L$ by $[L]_I = \bigcup_{u \in L} [u]_I$. A language $[L]_I$
with $L$ regular is called a rational semi-trace language. For a word or trace $t$
let alph$(t)$ denote the set of letters occurring in $t$. The universality problem
for rational languages over the independence alphabet $(A, I)$ is the question
whether $[L]_I = A^*$, for regular languages $L \subseteq A^*$.
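
The rewriting definition of a semi-trace can be made concrete with a small Python sketch. It is an illustration only, not part of the paper: the names `semi_trace` and `closure` are ours, words are plain strings, and the independence relation is a set of ordered letter pairs, so asymmetric relations are allowed. The exhaustive search is exponential in general and is intended for short words.

```python
from collections import deque


def semi_trace(u, independence):
    """Return [u]_I: every word reachable from u by rewriting
    x a b y -> x b a y for ordered pairs (a, b) in `independence`."""
    seen = {u}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in independence:
                swapped = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    queue.append(swapped)
    return seen


def closure(words, independence):
    """Return the I-closure [L]_I of a finite set of words L."""
    result = set()
    for u in words:
        result |= semi_trace(u, independence)
    return result
```

With the asymmetric relation `{("a", "b")}`, the word `ab` can be rewritten to `ba` but not conversely, which is exactly the difference between semi-traces and ordinary traces.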

By EXPSPACE we mean the complexity class DSPACE$(2^{n^{O(1)}})$.

Definition 1. A message sequence chart (MSC)
$M = (E, <, \mathcal{P}, \ell, S, R, \mathcal{M})$ is given by a poset $(E, <)$ of
events, a set $\mathcal{P}$ of processes, and a mapping $\ell : E \to \mathcal{P}$
that associates each event with a process (location). For each process $P$ the set
$\ell^{-1}(P)$ is totally ordered by $<_P$. The event set is partitioned as
$E = S \mathbin{\dot\cup} R$, where $S$ ($R$, resp.) is the set of send (receive,
resp.) events. Furthermore, $\mathcal{M} \subseteq S \times R$ is the graph of a
bijective mapping, relating every send with a unique receive, and conversely.

Let $e <_c f$ for every pair $(e, f) \in \mathcal{M}$. It is required that the
relation $<_c \cup \bigcup_{P \in \mathcal{P}} <_P$ is acyclic. Then $<$ is the
partial order induced by $<_c \cup \bigcup_{P \in \mathcal{P}} <_P$.

The partial order $<$ is called the visual order and is defined according to the
syntactical representation of the chart (e.g. represented according to the standard
syntax ITU Z.120).

In general, the visual order provides more ordering between its events than
intended by the designer. For example, in the visual order, the events of each
process (represented by a vertical line) are totally ordered, including messages
received by a process from different processes. However, enforcing such a linear
ordering between receive events is in general not the intended semantics of
the system. To make this distinction, we associate with every chart a causal
structure by means of a given semantics, which depends on the system
architecture. Formally, the causal structure associated with a chart $M$ is given as
$\mathrm{tr}(M) = (E, \prec, \mathcal{P}, \ell, S, R, \mathcal{M})$, where
$\prec \subseteq E \times E$ is a partial order called the causal order. The causal
order $\prec$ is defined as the partial order induced by an acyclic relation called
the precedence relation. The precedence relation is defined by
a set of rules that state which pairs of events ordered by the visual order also
belong to the causal order. We give below the set of rules corresponding to an
architecture where the communication is asynchronous and first-in-first-out (fifo).
For this semantics we only consider charts where the visual order satisfies, for
any events $e, f, e', f'$ and processes $P, P'$:

$e <_c f,\ e' <_c f',\ e <_P e',\ \ell(f) = \ell(f') = P' \implies f <_{P'} f'.$

Let $M = (E, <, \mathcal{P}, \ell, S, R, \mathcal{M})$ be a chart. An event $e$
precedes an event $f$ according to the fifo semantics if one of the following
conditions holds:

• A send and some event on the same process: $\{e, f\} \cap S \neq \emptyset$ and
  $e <_P f$ for some $P$.

• A message pair: $e <_c f$, i.e., $(e, f) \in \mathcal{M}$.

• Messages ordered by the fifo queue: $\{e, f\} \subseteq R$, $e <_P f$ for some
  $P$, and the unique $e', f'$ with $e' <_c e$, $f' <_c f$ satisfy $e' <_{P'} f'$
  for some $P'$.

Infinite behaviours can be specified using MSC graphs (or alternatively,
hierarchical MSC graphs, HMSC).

Definition 2. An MSC graph $M = (S, \to, s_0, c, \mathcal{P})$ is given as a finite,
directed graph $(S, \to, s_0)$ with source state $s_0 \in S$ and nodes labelled by
the mapping $c$, assigning to each state $s$ a finite chart $c(s)$ over $\mathcal{P}$.

Given two charts $M_i = (E_i, <_i, \mathcal{P}, \ell_i, S_i, R_i, \mathcal{M}_i)$,
$i = 1, 2$, over a common process set $\mathcal{P}$, let the concatenation of
$M_1, M_2$ be the chart
$M_1 M_2 = (E_1 \mathbin{\dot\cup} E_2, <, \mathcal{P}, \ell_1 \cup \ell_2, S_1 \cup S_2, R_1 \cup R_2, \mathcal{M}_1 \cup \mathcal{M}_2)$,
where $< \;=\; <_1 \cup <_2 \cup \bigcup_{P \in \mathcal{P}} \ell_1^{-1}(P) \times \ell_2^{-1}(P)$.
The infinite concatenation $M_1 M_2 \cdots$ is defined correspondingly. The
concatenation of the associated causal structures is defined as
$\mathrm{tr}(M_1 M_2 \cdots) = \mathrm{tr}(M_1)\,\mathrm{tr}(M_2) \cdots$.
Each path $(s_1, s_2, \ldots)$ in an MSC graph $M$ defines a (possibly infinite) MSC
by concatenation, $c(s_1) c(s_2) \cdots$. A maximal path is simply a path starting
with the source and having no proper extension in $M$. We denote the causal
structure associated with a path $\chi$ by $\mathrm{tr}(\chi)$.

Both partial orders of MSCs, the visual and the causal order (under the fifo
semantics), correspond exactly to semi-traces. With each set
$\mathcal{P} = \{P_1, \ldots, P_n\}$ of processes, we associate a set
$A = \{s_{ij}, r_{ij} \mid 1 \le i \ne j \le n\}$ of actions.
Letters in $A$ express the type (send/receive) and the location of each event $e$,
together with the location of the event $f$ such that $(e, f) \in \mathcal{M}$ or
$(f, e) \in \mathcal{M}$. Let $(e, f) \in \mathcal{M}$ be a message from $P_i$ to
$P_j$, i.e., $e \in S$, $f \in R$, $\ell(e) = P_i$ and $\ell(f) = P_j$. We define a
labelling $\lambda : E \to A$ by letting $\lambda(e) = s_{ij}$ and
$\lambda(f) = r_{ij}$, respectively. For a chart $M$ with event set $E$ let
msg$(M) = \{\lambda(e) \mid e \in E\}$.

Consider a chart $M$ and its causal structure tr$(M)$. The visual order is easily
seen to be induced by the dependence alphabet $(A, D_v)$ given by the dependence
relation
$D_v = \{(s_{ij}, r_{ij}) \mid i, j\} \cup \{(s_{ij}, s_{ik}), (s_{ij}, r_{ki}), (r_{ki}, s_{ij}), (r_{ik}, r_{jk}) \mid i, j, k\}$.
The causal order under the fifo semantics is induced by the dependence alphabet
$(A, D_c)$ with
$D_c = \{(s_{ij}, r_{ij}) \mid i, j\} \cup \{(s_{ij}, s_{ik}), (s_{ij}, r_{ki}), (r_{ki}, s_{ij}), (r_{ij}, r_{ij}) \mid i, j, k\}$.
We denote in the following by $I_c$, $I_v$ the complementary relations
$(A \times A) \setminus D_c$, $(A \times A) \setminus D_v$.
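
As an illustration of the two dependence relations, the following Python sketch builds them for a given number of processes. The encoding is our own assumption: an action $s_{ij}$ is the tuple `("s", i, j)` and $r_{ij}$ is `("r", i, j)`, and the function names are hypothetical. The sketch reads the visual relation as "message pair, or both actions on the same process", and the causal relation as the same with two receives related only when they use the same channel.

```python
from itertools import product


def actions(n):
    """Alphabet {s_ij, r_ij | 1 <= i != j <= n} as tuples (kind, i, j)."""
    return [(kind, i, j) for kind in "sr"
            for i in range(1, n + 1) for j in range(1, n + 1) if i != j]


def process_of(action):
    """A send s_ij occurs on P_i, a receive r_ij occurs on P_j."""
    kind, i, j = action
    return i if kind == "s" else j


def visual_dependence(n):
    """D_v: message pairs plus all pairs of actions on a common process."""
    dep = set()
    for x, y in product(actions(n), repeat=2):
        if x[0] == "s" and y == ("r", x[1], x[2]):
            dep.add((x, y))            # (s_ij, r_ij)
        elif process_of(x) == process_of(y):
            dep.add((x, y))            # same process line
    return dep


def causal_dependence(n):
    """D_c: as D_v, but two receives depend only if they are the same r_ij."""
    dep = set()
    for x, y in product(actions(n), repeat=2):
        if x[0] == "s" and y == ("r", x[1], x[2]):
            dep.add((x, y))            # (s_ij, r_ij)
        elif process_of(x) == process_of(y):
            if x[0] == "s" or y[0] == "s" or x == y:
                dep.add((x, y))        # a send is involved, or same fifo channel
    return dep
```

For three processes, the pair $(r_{13}, r_{23})$ lies in the visual relation but not in the causal one, which is the situation that gives rise to races on process $P_3$.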

In the trace setting, an MSC graph $M = (S, \to, s_0, c, \mathcal{P})$ is just a
transition system with nodes labeled by some partial order over the set of events.

Proposition 1. Let $M = (S, \to, s_0, c, \mathcal{P})$ be an MSC graph over the
process set $\mathcal{P} = \{P_1, \ldots, P_n\}$. Let
$A = \{s_{ij}, r_{ij} \mid 1 \le i \ne j \le n\}$ and let
$D_c, D_v \subseteq A \times A$ be defined as above. Then a nondeterministic,
polynomial-size automaton $\mathcal{A}_M = (Q, A, \delta, q_0, Q)$ can be
constructed such that:

i) $[L(\mathcal{A}_M)]_{I_v} = \{\lambda(c(\xi)) \mid \xi$ is a maximal path in $M\}$.
ii) $[L(\mathcal{A}_M)]_{I_c} = \{\lambda(\mathrm{tr}(\xi)) \mid \xi$ is a maximal path in $M\}$.

3 Detecting Race Conditions and Testing Confluence 

3.1 Race conditions 

An MSC $M$ is just a specification of some scenario. By definition, its causal
structure tr$(M)$ may allow more executions than the specification. In this
case we speak of race conditions. Clearly, race conditions should be avoided,
and verifying the absence of races is part of the correctness check of the design.
The next figure shows races on the process C (between the receive events from
A and B, respectively).

In the following we denote the set of all linearizations of the visual order of
an MSC $M$ by lin$(M)$. The set of executions of an MSC $M$, denoted exec$(M)$,
is the set of all linearizations of the causal order of tr$(M)$. The two notions are
extended to MSC graphs $M$, letting lin$(M) \subseteq E^\infty$
(exec$(M) \subseteq E^\infty$, resp.) be the set of linearizations (executions,
resp.) of maximal paths in $M$. By definition we have
lin$(M) \subseteq$ exec$(M)$ for every MSC $M$. If the inclusion is strict, then we
say that $M$ contains race conditions.
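
For a single finite chart, the definition of a race can be checked directly. The Python sketch below is an illustration under our own assumptions: a chart is given as a dictionary from process names to their events in top-to-bottom order, plus a list of (send, receive) pairs, and all function names are hypothetical. It builds the visual order and the fifo precedence relation from the three rules given in Section 2, and compares their transitive closures. This suffices because a finite partial order is determined by its set of linearizations, so the two sets of linearizations coincide exactly when the two orders do. It is not the algorithm of [1, 7].

```python
from itertools import product


def transitive_closure(nodes, edges):
    """Warshall-style closure; returns the set of all reachable pairs."""
    reach = {x: set() for x in nodes}
    for a, b in edges:
        reach[a].add(b)
    for k, i in product(nodes, repeat=2):   # k is the outer loop variable
        if k in reach[i]:
            reach[i] |= reach[k]
    return {(a, b) for a in nodes for b in reach[a]}


def visual_and_causal(lines, messages):
    """lines: process -> events in visual order; messages: (send, receive) pairs."""
    sends = {s for s, _ in messages}
    sender_of = {r: s for s, r in messages}
    position = {e: (p, idx) for p, evs in lines.items() for idx, e in enumerate(evs)}
    visual = set(messages)
    causal = set(messages)                  # rule: message pair
    for evs in lines.values():
        for a in range(len(evs)):
            for b in range(a + 1, len(evs)):
                e, f = evs[a], evs[b]
                visual.add((e, f))
                if e in sends or f in sends:
                    causal.add((e, f))      # rule: a send and some event, same process
                else:
                    pe, ie = position[sender_of[e]]
                    pf, jf = position[sender_of[f]]
                    if pe == pf and ie < jf:
                        causal.add((e, f))  # rule: two receives ordered by the fifo queue
    return list(position), visual, causal


def has_race(lines, messages):
    """True iff the causal order is strictly weaker than the visual order."""
    events, visual, causal = visual_and_causal(lines, messages)
    return transitive_closure(events, visual) != transitive_closure(events, causal)
```

A chart in which processes A and B each send one message to C has a race, since the two receives on C are ordered visually but not causally. If A sends both messages to C, the fifo rule orders the receives and there is no race.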

The problem of checking whether an MSC graph contains races is naturally
related to a question on closures of regular languages. This closure problem will
be shown below to be undecidable. Later, we will obtain that the race problem
itself is undecidable.

We refer to the following question as the closure problem over $A, I_1, I_2$: given
two commutation relations $I_1, I_2 \subseteq A \times A$ such that
$I_1 \subseteq I_2$, and a regular language $L \subseteq A^*$, we ask whether
$[L]_{I_1} = [L]_{I_2}$.

Proposition 2. The closure problem over $A, I_1, I_2$, with $(A, I_1) = a - c - b$
and $I_2 = A^2 \setminus \mathrm{id}_A$, is undecidable.

Proof. By a reduction from the universality problem for rational trace languages
over $(A, I_1)$, which was shown to be undecidable in [14].

Let $L \subseteq A^*$ be regular. Then $[L]_{I_1} = A^*$ if and only if
$[L]_{I_2} = A^*$ and $[L]_{I_1} = [L]_{I_2}$. Since $I_2$ is the total commutation
relation, $[L]_{I_2} = A^*$ is an equality test between two semilinear sets, hence
decidable by [6]. Thus, $[L]_{I_1} = [L]_{I_2}$ is undecidable.

Remark 1. We can state the above result more precisely. The universality
problem for rational trace languages over $(A, I)$ is decidable if and only if
$I \cup \mathrm{id}_A$ is transitive [14]. Moreover, rational sets over a transitive
independence relation form a boolean algebra and are recognized by a particular
kind of automata. Using this characterization we can show that the closure problem
over $A, I_1, I_2$ with $I_2 = A^2 \setminus \mathrm{id}_A$ is decidable if and
only if $I_1$ is transitive.

The proof of the following theorem follows easily and is omitted. 



Theorem 1. The race problem for MSC graphs under the fifo semantics is
undecidable.

3.2 Checking confluence

An MSC graph uses non-deterministic branching to express alternative behav- 
iors. This might be a problem for the implementation, since each computation 
should correspond to a single flow of control. A possible solution is to synchronize 
processes. However, global synchronization is not desirable and it can be avoided 
if we require that the execution of the MSC graph is confluent, as defined below. 
Intuitively, confluence corresponds to the following property: suppose that two
finite prefixes of computations are consistent in the sense that they can be
completed into a single computation. Then there exists in the system's description a
complete computation that indeed includes both prefixes. For example, consider
the case where two different protocols are initiated by different processes, e.g.,
in the first protocol (but not in the second), process $P_1$ sends a message to
$P_2$, and in the second protocol (but not in the first), process $P_3$ sends a
message to $P_4$. Since the two protocols are compatible so far, confluence requires
that there is an execution that contains both messages, possibly leading to the
need to resolve the conflict between the future behaviors of the protocols. Failing
to have the latter execution would mean that there is some additional way by which
the processes might have learned which of the two protocols to execute (perhaps
by presetting some values). Thus, although this might not be a mistake, the
confluence test that we discuss in this section is intended to alert the designer
to such a possible problem. The next figure shows a confluent MSC graph: the
possible executions have no upper bound. Omitting the acknowledgement from
C to A would make the graph non-confluent, since there is no common extension
of both executions.




We denote throughout this section the prefix order on causal structures
(semi-traces, resp.) by $\sqsubseteq$. That is, let
$\mathrm{tr}(M) \sqsubseteq \mathrm{tr}(N)$ if
$\mathrm{tr}(N) = \mathrm{tr}(M)\,\mathrm{tr}(M')$ for some chart $M'$. The least
upper bound of tr$(M)$ and tr$(N)$ with respect to the prefix order is denoted
$\mathrm{tr}(M) \sqcup \mathrm{tr}(N)$ (if it exists).

An MSC graph $M = (S, \to, s_0, c, \mathcal{P})$ is called confluent if for any
maximal paths $\alpha, \beta \in S^\infty$ in $M$ such that
$\mathrm{tr}(\alpha) \sqcup \mathrm{tr}(\beta)$ exists, there is some maximal path
$\gamma \in S^\infty$ in $M$ such that
$\mathrm{tr}(\alpha) \sqsubseteq \mathrm{tr}(\gamma)$ and
$\mathrm{tr}(\beta) \sqsubseteq \mathrm{tr}(\gamma)$. In terms of partial
commutations, we have the following question. Consider a regular language
$L \subseteq A^*$ given by a finite automaton $\mathcal{A}$ and a commutation
relation $I \subseteq A \times A$. Then we call $L$ ($\mathcal{A}$, resp.)
confluent over $(A, I)$ if for any $x, y \in L$ such that $[x]_I \sqcup [y]_I$
exists, there is some $z \in L$ with $[x]_I \sqsubseteq [z]_I$ and
$[y]_I \sqsubseteq [z]_I$. The next proposition states that the confluence problem
for partial commutations is in general undecidable.

Proposition 3. Let $(A, I)$ be defined by $A = \{a, b, c, d\}$ and the dependence
relation $(A, (A \times A) \setminus I)$ consisting of the two edges $a - b$ and
$c - d$. Then it is undecidable whether a given regular language is confluent over
$(A, I)$.

Proof. We use again the undecidability of the universality problem, i.e., the
question whether $[L]_J = B^*$, where $B = \{a, b, c\}$, $L \subseteq B^*$ is
regular and $J = \{(a, c), (c, a), (b, c), (c, b)\}$.

We first define an encoding $h : B^* \to B^*$ by $h(a) = ab$, $h(b) = ba$ and
$h(c) = c$. Let $K = h(L) b^2 d^2 + c^* d^2 + h(B^*) b^2 d$. Suppose first that
$[L]_J = B^*$; then $[K]_I = [h(B^*)(b^2 d^2 + b^2 d) + c^* d^2]_I$ is confluent.
For the converse let $K$ be confluent and consider $u \in (ab + ba)^*$,
$v \in c^*$. Then $\{u b^2 v d,\ v d^2\} \subseteq [K]_I$. Moreover, if
$[u' b^2 v' d^2]_I \in [h(L) b^2 d^2]_I$ is an upper bound of both
$[u b^2 v d]_I$ and $[v d^2]_I$, where $u' \in \{a, b\}^*$, $v' \in c^*$, then
$u = u'$ and $v = v'$. This implies $[h(L)]_I = [(ab + ba)^* c^*]_I$, hence
$[L]_J = \{a, b, c\}^*$.

Similarly to Thm. 1 we also obtain: 

Theorem 2. The confluence problem for MSC graphs under the fifo semantics 
is undecidable. 

4 Loop-connected Automata and Restricted MSC 
Graphs 

Basic decision problems for MSC graphs such as detecting races, checking confluence
or matching without gaps (a kind of model-checking) [12] are undecidable due
to the connection to rational semi-trace languages, i.e., languages of the form
$[L]_I$, where $L \subseteq A^*$ is regular and $I \subseteq A \times A$ is an
independence relation. This class of languages strictly includes the class of
regular, $I$-closed languages and contains also non-regular languages, in general
(cf. Thm. 3 below). However, the expressiveness of rational semi-trace languages is
based on the iteration of non-connected expressions, i.e., subexpressions $K^*$
where alph$(K) = \bigcup_{u \in K}$ alph$(u)$ is not a strongly connected subgraph
of the dependence relation $(A, (A \times A) \setminus I)$. If
we disallow this possibility, that is, if we restrict the Kleene iteration to
strongly connected expressions, then rational semi-trace languages remain regular
and enjoy all good (decidability) properties of regular languages. We consider in
this section loop-connected automata $\mathcal{A}$ as defined below. We give an
exponential upper bound on the size of a nondeterministic automaton for
$[L(\mathcal{A})]_I$ and we show that the universality problem for connected
rational languages is EXPSPACE-complete. Using similar arguments, we can also show
that the problem of the nonempty intersection of connected rational languages is
PSPACE-complete, i.e., the question whether
$[L(\mathcal{A})]_I \cap [L(\mathcal{B})]_I \ne \emptyset$ [11]. These results show
that partial commutations in general yield an exponential blow-up in the complexity
of finite automata.

Definition 3. Let $(A, I)$ be an independence alphabet,
$D = (A \times A) \setminus I$, and consider a finite automaton
$\mathcal{A} = (Q, A, \delta, q_0, F)$. We call $\mathcal{A}$ loop-connected
over $(A, I)$ if for every loop $(q_0, a_0, \ldots, a_{k-1}, q_k = q_0)$,
$q_i \in Q$, $a_i \in A$, the set alph$(a_0 \cdots a_{k-1}) \subseteq A$ of letters
occurring in the loop induces a strongly connected subgraph of $(A, D)$.

Theorem 3 ([4]). Let $(A, I)$ be an independence alphabet and $\mathcal{A}$ a
loop-connected automaton over $(A, I)$. Then $[L(\mathcal{A})]_I$ is a regular
language.

Before considering the complexity of testing whether an automaton is
loop-connected (with $(A, I)$ part of the input) let us note that it does not
suffice to check the property on simple loops only:

Example 1. Let $(\Sigma, I) = b - a - c$. Let
$\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ with $Q = \{q_0, q_1, q_2, q_3\}$,
$\delta = \{(q_0, a, q_1), (q_1, a, q_0), (q_1, b, q_2), (q_2, b, q_1), (q_0, c, q_3), (q_3, c, q_0)\}$,
$F = \{q_0\}$. Every simple loop of $\mathcal{A}$ is connected. However,
$[L(\mathcal{A})]_I$ is not regular. Note that the intersection of
$[L(\mathcal{A})]_I$ with the regular language $[a^*(bbcc)^*]_I$ is the
$I$-closure of $\{a^{2n}(bbcc)^m \mid n \ge m\}$, which is easily seen to be not
regular.
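
The difference between simple loops and arbitrary loops can be checked mechanically on small automata. The following Python sketch is an illustration only and all names in it are ours. It enumerates the letter sets of all nonempty closed walks by exploring pairs (state, set of letters used so far), and then tests each letter set for strong connectivity in the dependence graph. The exploration is exponential in the alphabet size, in line with the co-NP-hardness shown next. The test data encodes an automaton with loops $aa$ at $q_0$, $bb$ at $q_1$ and $cc$ at $q_0$, where $a$ is independent of both $b$ and $c$; this reading of Example 1 is our assumption.

```python
def strongly_connected(letters, dependence):
    """True iff `letters` induces a strongly connected subgraph of `dependence`."""
    letters = set(letters)

    def reach(start, edges):
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for a, b in edges:
                if a == x and b not in seen:
                    seen.add(b)
                    stack.append(b)
        return seen

    inside = {(a, b) for (a, b) in dependence if a in letters and b in letters}
    reverse = {(b, a) for (a, b) in inside}
    start = next(iter(letters))
    return reach(start, inside) == letters and reach(start, reverse) == letters


def loop_alphabets(transitions):
    """Letter sets of all nonempty closed walks (not only simple loops)."""
    states = {p for p, _, _ in transitions} | {r for _, _, r in transitions}
    found = set()
    for q0 in states:
        seen = {(q0, frozenset())}
        stack = [(q0, frozenset())]
        while stack:
            q, used = stack.pop()
            for p, a, r in transitions:
                if p != q:
                    continue
                nxt = (r, used | {a})
                if r == q0:
                    found.add(nxt[1])      # closed walk from q0 back to q0
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return found


def is_loop_connected(transitions, dependence):
    return all(strongly_connected(alph, dependence)
               for alph in loop_alphabets(transitions))
```

On this automaton each single-letter loop is connected, but the closed walk labelled $a\,bb\,a$ has the letter set $\{a, b\}$, which has no dependence edge between $a$ and $b$, so the automaton is not loop-connected.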

Proposition 4. The following problem is co-NP-complete:

Instance: An independence alphabet $(\Sigma, I)$ and a finite automaton $\mathcal{A}$.
Question: Is $\mathcal{A}$ loop-connected over $(\Sigma, I)$?

Proof. For the co-NP-hardness, assume that $F = C_1 \wedge \cdots \wedge C_m$ is a
boolean formula in CNF over the variable set $\{x_1, \ldots, x_n\}$. Moreover, let
$C_j = l_{j1} \vee l_{j2} \vee l_{j3}$, with $l_{jk}$ literals.

Let $\Sigma = \{a_{jk}, b_{jk} \mid 1 \le j \le m, k = 1, 2, 3\} \cup \{c_j, d_j \mid 1 \le j \le m\}$.
The symmetric dependence relation is given by the following picture:

The automaton $\mathcal{A}$ associated with $F$ has $n + 1$ states
$q_0, \ldots, q_n$. There is an edge from $q_0$ to $q_1$ labeled
$c_1 \cdots c_m d_1 \cdots d_m$. For each $i > 0$ there are two edges
from $q_i$ to $q_{i+1}$ labeled $P_i$, resp. $N_i$. We define
$P_i = p_1 \cdots p_m$, $N_i = q_1 \cdots q_m$, where $p_j = a_{jk}$ if
$x_i = l_{jk}$ and $q_j = b_{jk}$ if $\overline{x_i} = l_{jk}$; otherwise
$p_j = q_j = \varepsilon$. The main idea is that every (simple) loop from $q_0$ to
$q_n$ and back to $q_0$ corresponds to an assignment of the variables ($P_i$ means
true, $N_i$ false). Moreover, the simple loop associated with an assignment is
connected if and only if the assignment is non-satisfying.

4.1 Decision problems for loop-connected automata

For our previous undecidability results on MSC graphs (Thms. 1, 2) we used
the universality problem for rational trace languages, i.e., the question whether
$[L]_I = A^*$, where $L \subseteq A^*$ is regular. If $L$ is given by a
loop-connected automaton $\mathcal{A}$ over $(A, I)$, then by Thm. 3 we get
decidability. In this section, we exhibit a precise complexity characterization by
showing that the universality problem for rational semi-trace languages (i.e., $I$
is not necessarily symmetric) is EXPSPACE-complete. This leads later to the same
complexity bounds for detecting races and checking confluence. For $u \in A^*$ we
denote below by $I(u)$ the set $\{a \in A \mid au \Rightarrow_I ua\}$.

The algorithm for the universality problem is based on an exponential bound
on the size of the automaton accepting $[L(\mathcal{A})]_I$. The key lemma is a
classical automaton construction. Note that an exponential increase of the number
of states is unavoidable. Consider for example the finite language
$L_n = (a\bar{a} + b\bar{b})^n$ over $A = \{a, b, \bar{a}, \bar{b}\}$, where the
dependence relation $(A, (A \times A) \setminus I)$ consists of the two edges
$a - b$ and $\bar{a} - \bar{b}$. Then
$[L_n]_I \cap \{a, b\}^* \{\bar{a}, \bar{b}\}^* = \{u\bar{u} \mid u \in \{a, b\}^n\}$.
The latter language is known to require at least $2^n$ states for a minimal NFA
recognizing it.
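
The example language can be checked by brute force for small $n$. The Python sketch below is an illustration under stated assumptions: the barred letters are written as the capitals `A` and `B`, the only dependent pairs of distinct letters are taken to be $a, b$ and $\bar{a}, \bar{b}$, and the helper names are ours. It closes $L_n$ under the allowed commutations and keeps the words in which all unbarred letters come first.

```python
from itertools import product

LETTERS = "abAB"                       # 'A' and 'B' stand for the barred letters
DEPENDENT = {("a", "b"), ("b", "a"), ("A", "B"), ("B", "A")}
INDEPENDENT = {(x, y) for x in LETTERS for y in LETTERS
               if x != y and (x, y) not in DEPENDENT}


def closure(words):
    """Close a finite set of words under swapping adjacent independent letters."""
    seen, stack = set(words), list(words)
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in INDEPENDENT:
                v = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def separated_words(n):
    """Words of the closure of (aA + bB)^n with all unbarred letters first."""
    language = {"".join(blocks) for blocks in product(("aA", "bB"), repeat=n)}
    return {w for w in closure(language)
            if set(w[:n]) <= set("ab") and set(w[n:]) <= set("AB")}
```

For $n = 2$ this yields exactly the four words `aaAA`, `abAB`, `baBA`, `bbBB`, that is, each $u \in \{a, b\}^2$ followed by its barred copy.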

Lemma 1. Let $(A, I)$ be an independence alphabet and consider a loop-connected
automaton $\mathcal{A} = (Q, A, \delta, q_0, F)$. Let $n = |Q|$ denote the number
of states of $\mathcal{A}$. Consider some word $v \in A^*$ such that
$\delta(q, v) \ne \emptyset$ for some state $q$. Moreover, assume that
$v = t_0 u_1 \cdots t_{k-1} u_k t_k$ for some nonempty words $u_i, t_j$,
$1 \le i \le k$, $1 \le j < k$, and $(t_i, u_j) \in I$ for all $i < j$. Then we
have $k \le (n - 1)(|A| + 1)$.

Proof. Let $A_j$ denote for each $0 \le j < k$ the alphabet
$A_j = I(u_{j+1} \cdots u_k)$. Then
$A_0 \subseteq A_1 \subseteq \cdots \subseteq A_{k-1} \subseteq A$. Suppose by
contradiction that $k > (n - 1)(|A| + 1)$. Then we obtain some indices
$0 \le i < j < k$ such that $A_i = A_{i+1} = \cdots = A_j$ and $j - i \ge n - 1$.
Thus, we have $(t_i \cdots t_j,\ u_{i+1} \cdots u_{j+1}) \in I$. Let us also
fix some computation $\rho$ of $\mathcal{A}$ on $v$, i.e., a path from some state
$q$ to a state $q'$, which is labeled by $v$. Let $q_l$ denote the state reached on
$\rho$ after reading $t_0 u_1 \cdots t_{l-1} u_l$. With $j - i \ge n - 1$ we obtain
some $i \le l < m \le j$ such that $q_l = q_m$. Therefore,
$t_l u_{l+1} \cdots t_{m-1} u_m$ is the labelling of a loop of $\mathcal{A}$.
However, $(t_l \cdots t_{m-1},\ u_{l+1} \cdots u_m) \in I$, thus $\mathcal{A}$ is
not loop-connected, a contradiction.

Proposition 5. Let $(A, I)$ be an independence alphabet and
$\mathcal{A} = (Q, A, \delta, q_0, F)$ a loop-connected automaton with $n = |Q|$.
Then a finite automaton $\mathcal{B}$ with
$(n^2 \cdot 2^{|A|})^{n|A| + n + 1}$ states exists such that
$L(\mathcal{B}) = [L(\mathcal{A})]_I$.

Remark 2. We stated the above upper bound only for finite strings. However,
the same arguments also apply to nondeterministic Büchi automata.

Proposition 6. Given an independence alphabet $(A, I)$ and a loop-connected
automaton $\mathcal{A}$ over $(A, I)$, we can check in EXPSPACE whether
$[L(\mathcal{A})]_I = A^*$.

4.2 Lower Bounds 

We also have a matching lower bound for all problems considered previously.
The proof idea is based on a construction used by Walukiewicz [15] for proving
lower bounds on the temporal logic LTrL. We omit the proof for lack of space
and refer to the full version.

Proposition 7. The universality problem for loop-connected automata is hard
for EXPSPACE. This holds even if the independence alphabet is fixed to $(A, I)$,
with $A = \{a, b, c, d, \$, \#\}$ and $I$ the least symmetric relation containing
$\{a, b\} \times \{c, d\} \cup \{a, b\} \times \{\#\} \cup \{c, d\} \times \{\$\}$.

Theorem 4. The universality problem for loop-connected automata is complete 
for EXPSPACE (even if the independence relation is fixed). 



4.3 The race problem and the confluence problem revisited 

The classes of MSC graphs considered in this section are defined by
restricting loops to be strongly connected w.r.t. the visual (causal, resp.) order.
This restriction is natural for two reasons. First, it means that we disallow global
synchronization. (That is, iterating in parallel disjoint sequences of messages
would require some global counting mechanism, which is not natural in a
concurrent setting.) Second, it corresponds to finite state systems. That is, we
disallow e.g. unbounded computations where $P$ repeats sending a message to $P'$
(without waiting for an answer).

Recall that $A = \{s_{ij}, r_{ij} \mid 1 \le i \ne j \le n\}$ denotes the set of
actions, and that the visual and causal order, respectively, are induced by the
independence relations $I_v, I_c$ (dependence relations $D_v, D_c$).

Definition 4. Let $M = (S, \to, s_0, c, \mathcal{P})$ be an MSC graph. We say that
$M$ is locally synchronized with respect to the visual order (causal order, resp.)
if for every loop $(s_1, s_2, \ldots, s_k = s_1)$ in $M$, the set
$\bigcup_{i=1}^{k}$ msg$(c(s_i)) \subseteq A$ induces a strongly connected subgraph
of $(A, D_v)$ ($(A, D_c)$, resp.).

Proposition 8. The race problem under the fifo asynchronous semantics over 
MSC graphs which are locally synchronized w.r.t. the visual order is decidable in 
EXPSPACE. 

Proof. By Prop. 1 it suffices to consider two independence alphabets $(A, I_1)$,
$(A, I_2)$ with $I_1 \subseteq I_2$, and a loop-connected automaton $\mathcal{A}$
(w.r.t. $(A, I_1)$). It is not hard to see that
$[L(\mathcal{A})]_{I_1} \ne [L(\mathcal{A})]_{I_2}$ if and only if some words
$u, v \in A^*$ and $(a, b) \in I_2$ exist such that
$uabv \in [L(\mathcal{A})]_{I_1}$, but $ubav \notin [L(\mathcal{A})]_{I_1}$. By
Prop. 5 the automaton $\mathcal{B}$ accepting $[L(\mathcal{A})]_{I_1}$ has
exponentially many states, hence it is possible to guess $u, v$ and simultaneously
store the sets of states of $\mathcal{B}$ reached on $uabv$, resp. $ubav$, using
exponential space.



Proposition 9. The confluence problem under the fifo asynchronous semantics
over MSC graphs which are locally synchronized w.r.t. the causal order is in
EXPSPACE.

The proofs for the lower bounds for MSC graphs are similar to Sect. 3. They
become more technical since we have to encode the independence alphabet of
Prop. 7. For the race problem we use the fact that the language
$L \subseteq A^*$ defined in Prop. 7 always satisfies $[L]_{I_2} = A^*$. We omit
the proofs for lack of space and refer to the full version.

Theorem 5. Detecting races for MSC graphs which are locally synchronized
w.r.t. the visual order is EXPSPACE-complete.

Theorem 6. The confluence problem for MSC graphs which are locally synchro- 
nized w.r.t. the causal order is EXPSPACE-complete. 

Acknowledgment: The authors thank Volker Diekert for an improvement of
the proof of Prop. 2.

Abstract. The problem of computing the Hilbert basis of a linear
Diophantine system over nonnegative integers is often considered in
automated deduction and integer programming. In automated deduction, the
Hilbert basis of a corresponding system serves to compute the minimal
complete set of associative-commutative unifiers, whereas in integer
programming the Hilbert bases are tightly connected to integer polyhedra
and to the notion of total dual integrality. In this paper, we sharpen the
previously known result that the problem of deciding whether a given
solution belongs to the Hilbert basis of a given system is coNP-complete. We
show that the problem has a pseudopolynomial algorithm if the number
of equations in the system is fixed, but it is coNP-complete in the strong
sense if the given system is unbounded. This result is important in the
scope of automated deduction, where the input is given in unary and
therefore the previously known coNP-completeness result was unusable.
Moreover, we prove that, given a linear Diophantine system and a set
of solutions, deciding whether this set constitutes the Hilbert basis of the
system is also coNP-complete in the strong sense, thereby answering an
open problem formulated by Henk and Weismantel in 1996. Our result
also allows us to solve another open problem, formulated by Edmonds
and Giles in 1982: we prove that deciding whether a given set of
vectors constitutes the Hilbert basis of an unknown linear Diophantine
system is coNP-complete in the strong sense.



1 Introduction and Summary of Results

The Hilbert basis of a homogeneous system of linear Diophantine equations
over the nonnegative integers is the set of all non-zero vectors that are minimal
solutions with respect to the pointwise ordering. This set forms a basis of the
space of solutions of the system, that is, every solution can be written as a
nonnegative linear combination of vectors from the Hilbert basis, and no vector
of the Hilbert basis can be expressed as a positive linear combination of other
vectors. Moreover, the Hilbert basis is always finite and unique. The concept of
a Hilbert basis was studied as early as the second half of the 19th century by
Gordan [Gor73] and Hilbert [Hil90]. Since that time, it has received considerable
attention in linear algebra and integer programming.

* Part of this work was done while the first author was a lecturer (ATER) at IUT
A of the Université Nancy 2, France. The full version with proofs is available at
URL = http://www.loria.fr/~hermann/publications/recog.ps.gz.
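
The definition of the Hilbert basis as the set of minimal non-zero solutions translates directly into a brute-force membership test. The Python sketch below is an illustration only and is not one of the algorithms discussed in this paper: the function names are ours, the system is passed as a list of integer coefficient rows for the equations $Ax = 0$, and the search enumerates every vector below the candidate in the pointwise ordering, so it takes time exponential in the size of the input in general.

```python
from itertools import product


def is_solution(rows, x):
    """True iff x satisfies every homogeneous equation row . x = 0."""
    return all(sum(a * xi for a, xi in zip(row, x)) == 0 for row in rows)


def in_hilbert_basis(rows, s):
    """True iff s is a non-zero solution that is minimal in the pointwise ordering."""
    s = tuple(s)
    if not any(s) or not is_solution(rows, s):
        return False
    # Enumerate every vector t with 0 <= t <= s componentwise.
    for t in product(*(range(si + 1) for si in s)):
        if any(t) and t != s and is_solution(rows, t):
            return False               # a smaller non-zero solution exists
    return True
```

For the single equation $x_1 + x_2 - 2x_3 = 0$, the vectors $(1, 1, 1)$, $(2, 0, 1)$ and $(0, 2, 1)$ pass the test, while $(2, 2, 2)$ fails because $(1, 1, 1)$ lies below it.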

Computing the Hilbert basis of a homogeneous system of linear Diophantine
equations over nonnegative integers has turned out to be one of the key
problems in automated deduction. Its importance in this area emerged through the
work of Stickel [Sti81], who designed the first algorithm for unification in the
presence of associative-commutative (AC) function symbols. Stickel showed that
the minimal complete set of unifiers of a simultaneous elementary AC-unification
problem can be obtained from the Hilbert basis of an associated homogeneous
system of linear Diophantine equations over nonnegative integers. This naturally
raises the question of the complexity of counting the cardinality of the Hilbert
basis. In integer programming, Hilbert bases are strongly related to total dual
integrality. Universal test sets of integer programs can be constructed from Hilbert
bases. Hilbert bases also play an important role in various fields of mathematics,
like combinatorial convexity, toric varieties, and polynomial rings and ideals
(see [Sch86] for an excellent overview).

Following the articles showing its importance, researchers in the fields of
automated deduction and integer programming became interested in algorithms
computing the Hilbert basis of a system. Huet [Hue78], as well as Clausen and
Fortenbacher [CF89], described algorithms working for one equation. These
papers assume that the computation of the Hilbert basis of a system can be reduced
to successive computations of the Hilbert basis of single equations, interlaced
with substitutions of the result into the rest of the system. This approach entails
an exponential blow-up during the transformation. Several researchers,
including Contejean and Devie [CD94], Lankford [Lan89], Domenjoud [Dom91], Henk
and Weismantel [HW96], have also developed direct algorithms for computing
the Hilbert basis of systems with an arbitrary number of equations.

We may ask several questions in connection with the Hilbert basis and integer 
programming problems. These questions represent several variants of the 
Hilbert basis recognition problem. Usually, the upper bound of the Hilbert basis 
cardinality counting problem is determined by testing whether a candidate for 
a solution belongs to the witness set. If the membership in a counting class is 
determined, we may ask whether the membership of the Hilbert basis cardinality 
problem can be shown for a lower class. The answer to this question is given 
by the complexity analysis of the problem whether a solution s of a homogeneous 
linear Diophantine system S over nonnegative integers belongs to the Hilbert 
basis of S. This problem was already considered by Sebo [Seb90], and by Henk and 
Weismantel [HW96], where they show that the problem is coNP-complete. However, 
the coNP-completeness proof is done in both cases by a reduction from a 
problem that has a pseudopolynomial algorithm. This is an issue when the 
coefficients are given in unary notation. Indeed, when the Hilbert basis is 
computed for associative-commutative unification, the coefficients are written 
in unary notation since the underlying AC-unification problem in automated 
deduction is always given in unary. In this paper, we properly analyze the 
complexity of recognizing minimal solutions of homogeneous linear Diophantine 
systems S over nonnegative integers when their coefficients are written in 
unary notation. We also analyze the case when the number of equations in a 
system S is bounded. 

There are two subsequent natural questions for Hilbert basis recognition 
that we analyze in this paper. The first problem, given a system S and a set of 
solutions C of S, asks whether C forms the Hilbert basis of S. The complexity 
of this problem was left as an open question in [HW96]. The second problem 
is just a generalization of the previous one. Given a set of integral vectors C, 
it asks whether C constitutes the Hilbert basis for an unknown system. This 
problem was proved to be in coNP by Edmonds and Giles [EG82], but it was 
unknown whether it is coNP-complete (see Schrijver [Sch86] for an overview 
and references). In this paper, we completely settle the complexity of these two 
decision problems. 



2 Basic Notions and Definitions 

We assume that the reader is familiar with some basics of computational com- 
plexity and integer programming. Additional material on these topics can be 
found in the monographs [Pap94,Sch86]. 

A homogeneous linear Diophantine system over nonnegative integers is a 
system of equations S: Ax = 0, where $A = (a_i^j)$ is a $k \times n$ integer 
matrix and $x = (x_1, \ldots, x_n)$ is a vector of variables over nonnegative 
integers. We say that a solution s of S is nontrivial if it is different from 
the all-zero solution (0, ..., 0). We say that a solution 
$s = (s_1, \ldots, s_n)$ of S is smaller than a solution 
$s' = (s'_1, \ldots, s'_n)$, and write s < s', if $s \neq s'$ and, for all 
$i = 1, \ldots, n$, the relation $s_i \leq s'_i$ holds. The relation < is called 
the pointwise ordering on solutions. A solution s is minimal if it is nontrivial 
and there is no smaller nontrivial solution s'', i.e., s'' < s is false for 
every nontrivial solution s'' of S. 

The Hilbert basis H(S) of the system S is the set of all minimal solutions 
of S. This set is indeed a basis for the space of nontrivial solutions of S, since 
no minimal solution can be expressed as a positive linear combination of the 
other minimal solutions, whereas every nontrivial solution can be expressed as 
a positive linear combination of minimal solutions. The Hilbert basis H(S) is 
finite and is the unique basis of the space of nontrivial solutions of S. 
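
The definitions of the pointwise ordering and of minimal solutions can be illustrated by a small brute-force sketch. It is not an algorithm from this paper: the function names and the search bound `bound` are assumptions made for illustration, and the result is the full Hilbert basis only if `bound` is at least as large as every entry of every minimal solution.

```python
from itertools import product

def is_solution(A, x):
    """True if Ax = 0 for the integer matrix A, given as a list of rows."""
    return all(sum(a * v for a, v in zip(row, x)) == 0 for row in A)

def smaller(s, t):
    """Pointwise ordering: s < t iff s != t and s_i <= t_i for all i."""
    return s != t and all(a <= b for a, b in zip(s, t))

def minimal_solutions(A, bound):
    """Minimal nontrivial solutions of Ax = 0 with all entries <= bound.

    Any solution smaller than a vector in the box also lies in the box,
    so every vector returned here is a genuine minimal solution.
    """
    n = len(A[0])
    sols = [x for x in product(range(bound + 1), repeat=n)
            if any(x) and is_solution(A, x)]
    return sorted(s for s in sols if not any(smaller(t, s) for t in sols))

# Single equation x1 + x2 - 2*x3 = 0 (an illustrative instance).
print(minimal_solutions([[1, 1, -2]], bound=2))
```

The enumeration is exponential in the number of variables, so it is only meant to make the definitions concrete on toy systems.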

In this paper, we are essentially concerned with the computational complexity 
of deciding whether a given solution belongs to the Hilbert basis H(S), whether 
a given set of solutions C constitutes the Hilbert basis H(S) of a given 
homogeneous linear Diophantine system S: Ax = 0, and whether a given set of 
integral vectors C constitutes the Hilbert basis of an unknown system S. 

To prove lower bounds of the considered problems, we need NP-complete 
problems from which we perform a polynomial-time reduction. We will use the 
following two NP-complete problems (see [GJ79]). 

PARTITION 

Input: Finite set A of positive integers $a \in \mathbb{N}$. 

Question: Is there a subset $A' \subseteq A$ such that $\sum_{a \in A'} a = \sum_{a \in A - A'} a$? 

Note that PARTITION remains NP-complete even if the elements in A are ordered 
as $a_1 > a_2 > \cdots > a_{2n}$ and A' is required to contain exactly one of 
each pair of consecutive elements $a_{2i-1}$, $a_{2i}$, for each $i = 1, \ldots, n$. 

However, PARTITION can be solved by a pseudopolynomial algorithm. This 
means that PARTITION can be solved in polynomial time if the values in A are 
given in unary notation. We need an NP-complete problem in the strong sense if 
we want to prove completeness results even if the input of our problems is given 
in unary. The following problem is NP-complete in the strong sense. 

3-PARTITION 

Input: Set $A = \{a_1, \ldots, a_{3m}\}$ of 3m positive integer elements 
$a_i \in \mathbb{N}$ and a bound B, such that $B/4 < a_i < B/2$ and 
$a_1 + \cdots + a_{3m} = mB$. 

Question: Can A be partitioned into m disjoint sets $A_1, A_2, \ldots, A_m$ such 
that $\sum_{a \in A_i} a = B$ for each $i = 1, \ldots, m$? 
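
The pseudopolynomial algorithm for PARTITION mentioned above is usually realized as a dynamic program over achievable subset sums. The following is a minimal sketch of that standard technique, not code from this paper; the function name is hypothetical and the input is treated as a list of positive integers.

```python
def partition_exists(values):
    """Decide PARTITION by dynamic programming over achievable subset sums.

    Runs in time O(n * sum(values)), which is polynomial when the values
    are written in unary but exponential in the binary input length.
    """
    total = sum(values)
    if total % 2:
        return False
    target = total // 2
    reachable = [True] + [False] * target   # reachable[t]: some subset sums to t
    for a in values:
        # Go downwards so that each value is used at most once.
        for t in range(target, a - 1, -1):
            if reachable[t - a]:
                reachable[t] = True
    return reachable[target]
```

The running time depends on the magnitude of the numbers, which is why such an algorithm does not contradict NP-completeness in binary notation and why a strongly NP-complete problem such as 3-PARTITION is needed for unary lower bounds.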

In the sequel, we consider the following decision problems. The first problem 
checks whether a solution of a given system S belongs to the Hilbert basis H(S). 
This problem is related to counting the cardinality of the Hilbert basis of a given 
homogeneous linear Diophantine system over nonnegative integers. 

MINIMAL SOLUTION 

Input: Homogeneous linear Diophantine system S: Ax = 0 over nonnegative 
integers and an integral vector s. 

Question: Is s a minimal solution of the system S? 

We denote by MINIMAL SOLUTION(k) the instance of the decision problem with 
a fixed number k of equations in the system S. 

The second problem checks whether a given set of solutions C equals the 
Hilbert basis H(S) of a given system S. 

HILBERT BASIS CHECKING 

Input: Homogeneous linear Diophantine system S: Ax = 0 over nonnegative 
integers and a set of integral vectors C. 

Question: Is C the Hilbert basis of S? 

This problem is essentially the same as the Hilbert basis problem (HBP) formu- 
lated in [HW96], whose complexity was left open. 

The third problem checks whether a given set of integral vectors C constitutes 
the Hilbert basis of an unknown system. 

HILBERT BASIS RECOGNITION 
Input: Set of integral vectors C. 

Question: Is C the Hilbert basis of some homogeneous linear Diophantine sys- 
tem? 

The third problem is known to be in coNP [EG82], but its exact complexity was 
unknown. 

We must make clear what we mean by the size of the input in the above 
decision problems. This involves the question whether the coefficients in the 
system S: Ax = 0, in the solution s, and in the set of solutions C are all 
written in unary or binary notation. Note that equational unification problems in 
automated deduction are given in unary notation, i.e., each monomial ax counts 
for a occurrences of the variable x, since the inputs in automated deduction 
are considered to be terms over the alphabet of variables and function symbols. 
Since our decision problems are derived from similar problems in elementary 
AC-unification, it is quite natural to assume that the coefficients of the linear 
Diophantine systems are written in unary. However, in integer programming, 
coefficients of linear systems are usually written in binary. For this reason, we 
consider in the sequel both variants of the mentioned Hilbert basis recognition 
problems. The size of the system Ax = 0 is $kna$ in unary notation and 
$kn \log a$ in binary notation, where a is the maximum absolute value of the 
coefficients in the $k \times n$ matrix A. An upper bound for a problem given in 
binary holds also for the same problem written in unary. Similarly, a lower bound 
for a problem given in unary holds also for the same problem written in binary. 



3 Recognizing Vectors of the Hilbert Basis 

In this section, we investigate the complexity of recognizing elements of the 
Hilbert basis. This problem was already considered by Sebo [Seb90] and by Henk 
and Weismantel [HW96]. Both mention the result that MINIMAL SOLUTION(1) 
in binary is coNP-complete. In the full version, we give a new simpler proof 
of the lower bound from PARTITION with linear order. We form the equation 

$$x_1 a_1 + (1 - x_1) a_2 + \cdots + x_n a_{2n-1} + (1 - x_n) a_{2n} = (1 - x_1) a_1 + x_1 a_2 + \cdots + (1 - x_n) a_{2n-1} + x_n a_{2n},$$

that has a solution from {0, 1} if and only if the instance of PARTITION has 
one. From this we derive another equation 

$$2 \sum_{i=1}^{n} x_i (a_{2i-1} - a_{2i}) = y \sum_{i=1}^{n} (a_{2i-1} - a_{2i}),$$

that always has a solution for y = 2. It has a smaller solution, for y = 1, if 
and only if the previous equation has a solution from {0, 1}. 
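
The correspondence between the paired PARTITION instance and the derived equation can be checked by brute force on small inputs. The sketch below is illustrative only: the function names are hypothetical, the list `a` is assumed to hold a_1, ..., a_2n in order (0-indexed in the code), and x is restricted to {0, 1}, which is the only range relevant for solutions smaller than the one with y = 2.

```python
from itertools import product

def paired_partition_exists(a):
    """Brute force: choose exactly one element of each consecutive pair
    (a[2i], a[2i+1]) so that the chosen elements sum to half of the total."""
    n = len(a) // 2
    for x in product((0, 1), repeat=n):
        chosen = sum(a[2 * i] if x[i] else a[2 * i + 1] for i in range(n))
        if 2 * chosen == sum(a):
            return True
    return False

def derived_equation_has_solution(a, y):
    """Is there x in {0,1}^n with 2 * sum_i x_i d_i = y * sum_i d_i,
    where d_i is the difference of the i-th consecutive pair?"""
    d = [a[2 * i] - a[2 * i + 1] for i in range(len(a) // 2)]
    return any(2 * sum(xi * di for xi, di in zip(x, d)) == y * sum(d)
               for x in product((0, 1), repeat=len(d)))

# The all-ones assignment always works for y = 2.
print(derived_equation_has_solution([9, 4, 3, 2], 2))
```

On every instance the two predicates should agree for y = 1, which is the equivalence the proof hint relies on.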

A natural question is what happens when the previous problem is written in 
unary notation. If the problem remained coNP-complete also in unary notation, 
this would mean that we used a problem not strong enough to prove the lower 
bound. On the other hand, the considered problem may really be 
pseudopolynomial. We can extend this question to any fixed number of 
equations, asking whether the decision problem MINIMAL SOLUTION(k) given in 
unary can be solved in polynomial time for any fixed k. 

Theorem 1. MINIMAL SOLUTION(k) in unary notation can be solved in polynomial 
time for any fixed k. 

Proof. (Hint) Checking whether s is a minimal solution of S can be reduced 
to the problem of checking whether s is a solution of S, followed by a check 
whether there exists a solution of a subsequent non-homogeneous system with 
bounded positive coefficients. The number of bounded systems to be checked is 
polynomial. □ 

The situation changes radically if there is no bound on the number of 
equations in the system S: Ax = 0 with the coefficients of A written in unary. 
The following theorem shows that we cross the tractability boundary in this case. 

Theorem 2. MINIMAL SOLUTION is coNP-complete in the strong sense. 

Proof. (Hint) The lower bound is derived from 3-PARTITION. We encode the 
3-PARTITION instance by a union of four systems: 

$$S_1 = \{a_1 x_1^i + a_2 x_2^i + \cdots + a_{3m} x_{3m}^i = By \mid i = 1, \ldots, m\},$$
$$S_2 = \{x_1^i + x_2^i + \cdots + x_{3m}^i = 3y \mid i = 1, \ldots, m\},$$
$$S_3 = \{x_i^1 + x_i^2 + \cdots + x_i^m = y \mid i = 1, \ldots, 3m\},$$
$$S_4 = \{z_1 + (m - 1) z_2 = y\}.$$

The union $S_1 \cup S_2 \cup S_3 \cup S_4$ always has a solution for y = m. It 
has a smaller solution, for y = 1, if and only if the instance of 3-PARTITION 
has one. □ 
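
To make the encoding concrete, the following sketch checks a candidate assignment against the four systems of the proof hint on a toy instance. The function name, the data layout (`x[i][j]` stands for the variable with superscript i and subscript j, both 0-indexed) and the instance itself are assumptions chosen for illustration.

```python
def satisfies_systems(a, B, x, y, z1, z2):
    """Check the four systems S1..S4 for the 3m numbers a and the bound B."""
    m = len(a) // 3
    s1 = all(sum(a[j] * x[i][j] for j in range(3 * m)) == B * y for i in range(m))
    s2 = all(sum(x[i]) == 3 * y for i in range(m))
    s3 = all(sum(x[i][j] for i in range(m)) == y for j in range(3 * m))
    s4 = z1 + (m - 1) * z2 == y
    return s1 and s2 and s3 and s4

# Toy instance with m = 2 and B = 15: every a_i lies strictly between
# B/4 and B/2, and the numbers sum to m * B = 30.
a, B = [4, 5, 6, 4, 5, 6], 15

# The solution with y = m: all x variables equal to 1, z1 = z2 = 1.
all_ones = [[1] * 6 for _ in range(2)]
print(satisfies_systems(a, B, all_ones, y=2, z1=1, z2=1))

# A smaller solution with y = 1, read off from the 3-partition
# {4, 5, 6} and {4, 5, 6}: row i marks the elements of the i-th part.
from_partition = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
print(satisfies_systems(a, B, from_partition, y=1, z1=1, z2=0))
```

The second assignment is pointwise smaller than the first, so for this instance the solution with y = m is not minimal, in line with the statement of the hint.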

Remark 1. The complexity of the considered problems MINIMAL SOLUTION(1) 
in binary, MINIMAL SOLUTION(k) in unary for fixed k, and MINIMAL SOLUTION 
remains the same even if the given vector s is known to be a solution of the 
system S. 

4 Checking the Hilbert Basis 

The results of the previous section naturally extend to problems where we check 
whether a set of vectors C is a subset of the Hilbert basis H(S) of a given 
system S. In this section, we investigate the question whether the set of vectors C 
equals the Hilbert basis H(S), both when the system S is known and when S is 
unknown. The complexity of the first problem was left open by 
Henk and Weismantel in [HW96]. 

Theorem 3. HILBERT BASIS CHECKING is coNP-complete in the strong sense. 

Proof. (Hint) The lower bound is derived from MINIMAL SOLUTION. The system S 
is enlarged by a new system Bx = 0, computed from the matrix A and 
the vector s, so that s becomes the unique minimal solution of the enlarged 
system. □ 

Note that the HILBERT BASIS CHECKING problem remains coNP-complete in the 
strong sense even if C is known to be a set of solutions of the system S. 

Finally, we consider the HILBERT BASIS RECOGNITION problem stated by 
Edmonds and Giles in [EG82], where they showed that the problem is in coNP. 
We will prove a tight lower bound for this problem. 

We introduce the concept of the canonical form of an integral matrix, to be 
able to compare homogeneous linear Diophantine systems over nonnegative 
integers. Each integral matrix A can be seen as a set of integral vectors 
represented by the rows $a_i$. 

Definition 1. The canonical form $A^{\perp}$ of an integral matrix A is the 
smallest $k \times n$ integral matrix, with respect to the number of rows k, such 
that the sets of nonnegative integral solutions 
$\{x \in \mathbb{Z}_+^n \mid Ax = 0\}$ and $\{x \in \mathbb{Z}_+^n \mid A^{\perp}x = 0\}$ 
are equal, and each row $a_i$ of $A^{\perp}$ has the following properties: 

1. $a_i^j = 0$ for $j = 1, \ldots, i - 1$, i.e., the coefficients below the main 
diagonal are equal to 0; 

2. $a_i^i > 0$, i.e., the main diagonal coefficients are positive; 

3. $a_i^j = 0$ for $j = i + 1, \ldots, k$, i.e., the coefficients above the main 
diagonal are equal to 0; 

4. $\gcd(a_i^1, \ldots, a_i^n) = 1$, i.e., the greatest common divisor of the 
coefficients of $a_i$ is equal to 1; 

5. either there exists a negative coefficient $a_i^j < 0$ or all coefficients 
$a_i^j$ are equal to 0, where $j \in \{k + 1, \ldots, n\}$. 

Hence, the canonical matrix $A^{\perp}$ has the form $(U_k \; A_{n-k})$, where 
$U_k$ is a positive integral diagonal $k \times k$ matrix. 

The canonical form $A^{\perp}$ resembles the Smith normal form of an integral 
matrix A. It can be constructed by the following algorithm. 

Algorithm A 

Input: Integral matrix A. 

Output: Canonical form $A^{\perp}$ of A. 

Method: Perform the following rules, with the precedence Combine $\succ$ Zero 
$\succ$ Negative $\succ$ Gcd $\succ$ Exchange $\succ$ Below $\succ$ Above 
$\succ$ Separate, on a given integral matrix A, while one of the conditions is 
satisfied. 

Combine: If there exists a row $a_i$ in A that can be written as a linear 
combination with rational coefficients of the other rows in A, then remove the 
row $a_i$ from A. 

Zero: Remove each all-zero row $a_i = (0, \ldots, 0)$ from A. 

Negative: If there exists a row $a_i$ in A and a positive integer m, such that 
$a_i^1 = \cdots = a_i^{m-1} = 0$ and $a_i^m < 0$, then replace the row $a_i$ in A 
by the new row $a'_i = -a_i$. This means that we multiply the row $a_i$ by the 
coefficient -1. 

Gcd: If there exists a row $a_i$ in A, such that $\gcd(a_i^1, \ldots, a_i^n) > 1$, 
then replace the row $a_i$ in A by the new row $b_i$, where we set 
$b_i^l = a_i^l / \gcd(a_i^1, \ldots, a_i^n)$. This rule forces the greatest 
common divisor of a row to be equal to 1. 

Exchange: If there exist two rows $a_i$ and $a_j$ in A, where i < j, and two 
positive integers $m_i$, $m_j$, such that $m_i > m_j$, $a_i^{m_i} \neq 0$, 
$a_j^{m_j} \neq 0$, $a_i^l = 0$ for all $l = 1, \ldots, m_i - 1$, and 
$a_j^p = 0$ for all $p = 1, \ldots, m_j - 1$, then exchange the rows $a_i$ and 
$a_j$ in A. 

Below: If there exist two rows $a_i$ and $a_j$ in A, where i < j, and a positive 
integer m, such that $a_i^m \neq 0$, $a_j^m \neq 0$, and $a_i^l = 0$ for all 
$l = 1, \ldots, m - 1$, then replace the row $a_j$ in A by the new row 
$a'_j = a_i^m a_j - a_j^m a_i$. This rule forces the coefficients below the main 
diagonal of A to be equal to 0. 

Above: If there exist two rows $a_i$ and $a_j$ in A, where i < j, and a positive 
integer m, such that $a_i^m \neq 0$, $a_j^m \neq 0$, and $a_j^l = 0$ for all 
$l = 1, \ldots, m - 1$, then replace the row $a_i$ in A by the new row 
$a'_i = a_j^m a_i - a_i^m a_j$. This rule forces the coefficients above the main 
diagonal of A to be equal to 0. 

Separate: If there exists a row $a_i$ with the coefficients $a_i^j \geq 0$ for 
each $j = 1, \ldots, n$, then add for each positive coefficient $a_i^j > 0$ the 
row $e_j = (e_j^1, \ldots, e_j^n)$ to A, where $e_j^l = 1$ if l = j and 
$e_j^l = 0$ otherwise. This transformation corresponds to the idea that a row 
$a_i$ with nonnegative coefficients forces the variables $x_j$ in the system 
Ax = 0 to be assigned the value $x_j = 0$ if the coefficient $a_i^j$ is positive, 
since we consider systems over nonnegative integers. 

End of algorithm 
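
Three of the rules above act on one row at a time and are easy to sketch in code. The fragment below covers only Zero, Negative and Gcd; it is a simplified illustration with a hypothetical function name, and it does not implement the row-combining rules or the precedence order of the full algorithm.

```python
from functools import reduce
from math import gcd

def normalize_rows(A):
    """Apply the Zero, Negative and Gcd rules to every row of A."""
    result = []
    for row in A:
        if not any(row):
            continue                          # Zero: drop all-zero rows
        lead = next(v for v in row if v != 0)
        if lead < 0:
            row = [-v for v in row]           # Negative: first nonzero entry becomes positive
        g = reduce(gcd, (abs(v) for v in row))
        result.append([v // g for v in row])  # Gcd: divide the row by its gcd
    return result

print(normalize_rows([[0, 0, 0], [-2, 4, -6], [3, 6, 9]]))
```

None of these three rules changes the solution set of Ax = 0, since multiplying an equation by a nonzero rational constant or dropping the trivial equation 0 = 0 leaves the solutions unchanged.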

Algorithm A deletes redundancies in an integral matrix. It is clear that the 
systems S: Ax = 0 and S': A'x = 0 over nonnegative integers have the same 
set of solutions if A' can be constructed from A by a successive application 
of the rules from Algorithm A. Unrestricted application of the rules from 
Algorithm A may result in exponentially big intermediate coefficients of the 
constructed matrix, even if the resulting canonical form $A^{\perp}$ is 
polynomial. To avoid this problem, we must apply the rules Above, Below, and 
Exchange in the same way as proposed by Kannan and Bachem in [KB79]. This method 
consists of computing the normal form $A_i^{\perp}$ of the first i rows before 
treating the (i + 1)-th row of the matrix A. Under these circumstances, 
Algorithm A runs in polynomial time and the intermediate coefficients are of 
polynomial size. 

Lemma 1. Algorithm A always terminates and computes in polynomial 
time for each integral matrix A the unique canonical matrix $A^{\perp}$. 

We define two integral matrices A and B to be equivalent if their canonical 
forms $A^{\perp}$ and $B^{\perp}$ are equal. In the same spirit, we define two 
systems S: Ax = 0 and S': Bx = 0 to be equivalent if their matrices A and B are 
equivalent and they have a nontrivial solution. The following proposition shows 
that there is a one-to-one correspondence between equivalent systems and 
nonempty Hilbert bases. 

Proposition 1. Let S: Ax = 0 and S': Bx = 0 be two homogeneous linear 
Diophantine systems over nonnegative integers with nonempty Hilbert bases. The 
systems S and S' are equivalent if and only if they have the same Hilbert basis 
H(S) = H(S'). 

Proof. (Hint) The only-if direction is clear. Two equivalent systems S and S' 
have the same set of solutions and, consequently, also the same Hilbert basis. 
For the if direction, assume that S and S' are not equivalent, but both have the 
same nonempty Hilbert basis. We prove by means of the Fundamental Theorem 
of Linear Inequalities that the Hilbert bases H(S) and H(S') must be different, 
which yields a contradiction. □ 

Given a set of nonnegative vectors $C = \{c_1, \ldots, c_m\}$, we need to 
reconstruct a homogeneous linear Diophantine system S': Bx = 0 over nonnegative 
integers, such that each vector $c_i \in C$ is a solution of S'. The system S' is 
constructed in the following way. 

Let d be the dimension of the vectors $C = \{c_1, \ldots, c_m\}$. Start with 
$S' = \emptyset$. First of all, we must look for the coordinates that are equal 
to zero for each vector $c_i \in C$, $i = 1, \ldots, m$. For each coordinate 
$j \in \{1, \ldots, d\}$, such that $c_i^j = 0$ for all vectors $c_i \in C$, put 
the equation $x_j = 0$ into S'. This creates the rows with one coefficient equal 
to 1 and the others equal to zero in the matrix B. Now, form the equation 
$E(x, y)\colon x_1 y_1 + \cdots + x_d y_d = 0$. Substitute into E(x, y) 
consecutively the vectors $c_i = (c_i^1, \ldots, c_i^d)$ from C for the variable 
vector $x = (x_1, \ldots, x_d)$, forming the equations $E(c_1, y)$, $E(c_2, y)$, 
..., $E(c_m, y)$. This creates a new homogeneous linear Diophantine system 
S'': Dy = 0 over integers. Solve the system S'': Dy = 0 by known methods from 
linear algebra, e.g., by computing the Smith normal form of the matrix D. If the 
system S'' has no solution then there is no system S' with solutions including 
the set of vectors C. Let 
$Y(p_1, \ldots, p_q) = \{y_i = l_i(p_1, \ldots, p_q) \mid i = 1, \ldots, d\}$ be 
the parametric solution of the system S'' with the parameters 
$p = (p_1, \ldots, p_q)$, where $l_i$ are linear Diophantine expressions over p. 
We substitute consecutively the orthonormal basis 
{(1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1)} for p into the 
parametric solution $Y(p_1, \ldots, p_q)$, producing the particular solutions 
$Y_1 = Y(1, 0, \ldots, 0)$, $Y_2 = Y(0, 1, 0, \ldots, 0)$, ..., 
$Y_q = Y(0, \ldots, 0, 1)$ of the system S''. Clearly, each solution of S'' can 
be written as a linear combination with integer coefficients of the solutions 
$Y_1, \ldots, Y_q$. Now, we substitute consecutively the solutions 
$Y_1, \ldots, Y_q$ into the equation E(x, y) for the variables y. We add the 
equation $E(x, Y_j)$ to the constructed system S', for each $j = 1, \ldots, q$. 
This terminates the construction of the system S'. The system S' can be 
constructed in polynomial time, because we can find the parametric solution Y(p) 
of the homogeneous linear Diophantine system S'' over integers in polynomial 
time. This is based on the fact that the Smith normal form of an integer matrix 
can be computed in polynomial time. 

Example 1. Let {100100, 010010, 001010, 100020, 020100, 011100, 002100} 
be the set of vectors C for which we want to reconstruct a homogeneous linear 
Diophantine system S': Bx = 0 over nonnegative integers. The 6th coordinate 
of the vectors C is always equal to 0, therefore we set $x_6 = 0$. Form the 
equation $E(x, y)\colon x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4 + x_5 y_5 + x_6 y_6 = 0$. 
Substitute consecutively the vectors C for the variable x into E(x, y), forming 
the equations $E(c_1, y), \ldots, E(c_7, y)$. This results in the homogeneous 
linear Diophantine system S'' 

$$\begin{aligned}
y_1 + y_4 &= 0 & y_2 + y_5 &= 0 \\
y_3 + y_5 &= 0 & y_1 + 2y_5 &= 0 \\
2y_2 + y_4 &= 0 & y_2 + y_3 + y_4 &= 0 \\
2y_3 + y_4 &= 0
\end{aligned}$$

over integers. A parametric solution of the system S'' is the set 

$$Y(p) = \{y_1 = 2p,\ y_2 = p,\ y_3 = p,\ y_4 = -2p,\ y_5 = -p\}.$$

Instantiating p = 1 and adding the equation E(x, Y(1)) to the reconstructed 
system results in the following final system 

$$S' = \{2x_1 + x_2 + x_3 - 2x_4 - x_5 = 0,\ x_6 = 0\}.$$

It can be easily seen that each solution of the system S' is a linear combination 
with positive integer coefficients of the vectors C. In fact, this is not a surprise, 
since the set C is the Hilbert basis of the reconstructed system S'. 
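
The example can be double-checked mechanically. The sketch below is not part of the construction itself: it verifies that Y(1) annihilates every vector of C and then enumerates the solutions of S' with entries at most 2 to compare the minimal ones with C. The box bound 2 and all identifiers are choices made for this illustration.

```python
from itertools import product

C = [(1, 0, 0, 1, 0, 0), (0, 1, 0, 0, 1, 0), (0, 0, 1, 0, 1, 0),
     (1, 0, 0, 0, 2, 0), (0, 2, 0, 1, 0, 0), (0, 1, 1, 1, 0, 0),
     (0, 0, 2, 1, 0, 0)]

# Y(1) from the parametric solution, padded with 0 for the unused y6.
Y1 = (2, 1, 1, -2, -1, 0)

def in_S_prime(x):
    """Membership test for S' = {2x1 + x2 + x3 - 2x4 - x5 = 0, x6 = 0}."""
    return 2 * x[0] + x[1] + x[2] - 2 * x[3] - x[4] == 0 and x[5] == 0

# Every equation E(c_i, Y(1)) holds, so each c_i solves the first equation of S'.
assert all(sum(c * y for c, y in zip(ci, Y1)) == 0 for ci in C)

# Nontrivial solutions inside the box [0, 2]^6 and the minimal ones among them.
sols = [x for x in product(range(3), repeat=6) if any(x) and in_S_prime(x)]
minimal = [s for s in sols
           if not any(t != s and all(a <= b for a, b in zip(t, s)) for t in sols)]

print(sorted(minimal) == sorted(C))
```

A finite box cannot by itself rule out minimal solutions with larger entries; here a short case analysis on x5 and x1 shows that every nontrivial solution dominates some vector of C, so the box is large enough.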

Suppose now that C is the Hilbert basis of an unknown system. Following the 
Farkas-Minkowski-Weyl theorem (see Corollaries 7.1a and 7.1b in [Sch86]), the 
polyhedron $P = \{x \mid Bx \leq 0\}$ associated with the reconstructed system S' 
is equal to the set of nonnegative integral vectors 

$$\mathrm{cone}(G) = \{\lambda_1 g_1 + \cdots + \lambda_t g_t \in (\mathbb{Z}_+)^d \mid g_i \in G,\ \lambda_i \in \mathbb{Q}_+\}$$

formed as linear combinations of nonnegative integer vectors 
$G = \{g_1, \ldots, g_t\}$ with nonnegative rational coefficients $\lambda_i$, 
also called the smallest convex cone generated by the nonnegative integer 
vectors G. Since each vector from C is a solution of the reconstructed system 
S': Bx = 0, the set C must be a subset of the polyhedron P. Since P = cone(G), 
each vector $c_i \in C$ can be written as a linear combination with nonnegative 
rational coefficients of the vectors G. The cone cone(G) can be seen also as a 
linear combination with rational coefficients of the vectors C, i.e., 

$$\mathrm{cone}(G) = \{\mu_1 c_1 + \cdots + \mu_m c_m \in (\mathbb{Z}_+)^d \mid c_i \in C,\ \mu_i \in \mathbb{Q}\}.$$

If we know that the set of vectors C is a Hilbert basis of an unknown homogeneous 
linear Diophantine system over nonnegative integers, then it must also generate 
the cone cone(G) and hence the polyhedron P. Therefore, we have that 
t = m and G = C, since the Hilbert basis is unique. Therefore C must also be 
the Hilbert basis of the reconstructed system S': Bx = 0. Combining it with 
Proposition 1, we obtain the following result. 

Proposition 2. Let S be a homogeneous linear Diophantine system over 
nonnegative integers. Let C be the Hilbert basis of an unknown homogeneous linear 
Diophantine system over $\mathbb{Z}_+$. The set C is the Hilbert basis of the 
system S if and only if the system S' reconstructed from C is equivalent to S. 

We are able now to prove the complexity result concerning the HILBERT BASIS 
RECOGNITION problem. 

Theorem 4. HILBERT BASIS RECOGNITION is strongly coNP-complete. 

5 Concluding Remarks 

We showed that the problem, given a homogeneous linear Diophantine system S 
over nonnegative integers, asking whether a given solution s belongs to the 
Hilbert basis of S, is coNP-complete in the strong sense, but it has a 
pseudopolynomial algorithm if the number of equations in the system S is fixed. 
This sharpens previous complexity results on recognizing Hilbert basis vectors 
by Sebo [Seb90] and by Henk and Weismantel [HW96]. 

We also proved that the problem, given a homogeneous linear Diophantine 
system S and a set of vectors C, asking whether C is the Hilbert basis of S, is 
coNP-complete in the strong sense. This result answers an open problem stated 
by Henk and Weismantel in [HW96]. Moreover, the aforementioned problem 
represents an intermediate step for proving the lower bound of the problem whether 
a given set of integral vectors constitutes the Hilbert basis of an unknown 
homogeneous linear Diophantine system over nonnegative integers. Essentially the 
same problem was stated by Edmonds and Giles in [EG82], where they showed 
that it belongs to coNP, but the lower bound was unknown. We proved that 
the problem is coNP-complete in the strong sense, which completely settles the 
complexity of the considered problem. 

Abstract. We undertake a thorough complexity study of the following 
fundamental optimization problem, known as the $\ell_p$-norm shortest 
extended GCD multiplier problem: 

given $a_1, \ldots, a_n \in \mathbb{Z}$, find an $\ell_p$-norm shortest gcd 
multiplier for $a_1, \ldots, a_n$, i.e., a vector $x \in \mathbb{Z}^n$ with 
minimum $\|x\|_p$ satisfying $\sum_{i=1}^{n} x_i a_i = \gcd(a_1, \ldots, a_n)$. 

First, we prove that the shortest GCD multiplier problem (in its feasibility 
recognition form) is NP-complete for every $\ell_p$-norm with $p \in \mathbb{N}$. 
This gives an affirmative answer to a conjecture raised by Havas and 
Majewski. We then strengthen this negative result by ruling out even 
polynomial-time algorithms which only approximate an $\ell_p$-norm shortest 
gcd multiplier within a factor [...], for $\gamma$ an arbitrarily small 
positive constant, under the widely accepted complexity theory assumption 
NP $\not\subseteq$ DTIME($n^{\mathrm{poly}(\log n)}$). 

For positive results we focus on the $\ell_2$-norm GCD multiplier problem. 
We show that approximating this problem within a factor of $\sqrt{n}$ is 
very unlikely NP-hard by placing it in NP $\cap$ coAM through a simple 
constant-round interactive proof system. This result is complemented by 
a polynomial-time algorithm which computes an $\ell_2$-norm shortest gcd 
multiplier up to a factor of [...] 

This study is motivated by the importance of extended gcd calculations 
in applications in computational algebra and number theory. Our results 
rest upon the close connection between the hardness of approximation 
and the theory of interactive proof systems. 

1 Introduction 

Extended gcd computation is of particular interest in number theory and in 
computational linear algebra, in both of which it plays a basic role in 
fundamental algorithms (see [14] for some references). The gcd problem for more 
than two numbers is interesting in its own right. Furthermore, it has important 
applications in its extended form where corresponding multipliers are required, 
for example in computing canonical normal forms of integer matrices ([13, 10, 
15]). Of particular value are methods which find good solutions to the extended 
gcd problem with respect to some measure. Optimization problems for extended 
GCD calculation with respect to the $\ell_0$-metric and the max-norm 
($\ell_\infty$) have been proved NP-complete [19] and approximation for these 
measures is handled in [21]. Interestingly, the $\ell_0$-metric problem is 
solvable in average polynomial time [12]. 

It is widely believed that some NP-optimization problems cannot be solved 
efficiently, i.e., in time polynomial in the input length of the problem. However, in 
many practical applications, including the extended gcd problem, approximate 
solutions of such problems do suffice. Thus, there has been much work done in 
studying the complexity of finding approximate solutions for NP-optimization 
problems. 

It is desirable to have approximation algorithms such that the value of the 
returned solution is within a small factor of the optimum solution of the 
problem. For minimization problems, the worst-case ratio of the value of the 
solution returned by the approximation algorithm to the optimum solution is 
called the approximation factor of the approximation algorithm. It is useful to 
understand the quality of an approximation algorithm relative to the best that 
can be expected to be available in polynomial time. Thus, we may want to 
guarantee that a constructed approximation algorithm is best possible in that no 
substantially better approximation factor can be achieved, unless certain 
complexity theoretical assumptions are wrong. 

In this context it is natural to study the complexity of the following opti- 
mization problem: 

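Shortest GCD Multiplier in $\ell_p$-norm (SGCDM$_p$) 

INSTANCE: n numbers $a_1, \ldots, a_n \in \mathbb{Z}$ 

SOLUTION: A vector $x \in \mathbb{Z}^n$ such that $\sum_{i=1}^{n} x_i a_i = \gcd(a_1, \ldots, a_n)$ 

MEASURE: The $\ell_p$-norm $\|x\|_p := \left(\sum_{1 \leq i \leq n} |x_i|^p\right)^{1/p}$ of the vector x 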
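
For very small instances the optimization problem can be solved by exhaustive search, which helps to make the definition concrete. The sketch below is not one of the algorithms analyzed in this paper; the function name and the search radius `bound` are assumptions, and the answer is optimal only among vectors whose entries lie in [-bound, bound].

```python
from functools import reduce
from itertools import product
from math import gcd

def shortest_gcd_multiplier(a, p, bound):
    """Exhaustive search for a gcd multiplier x minimizing sum_i |x_i|^p.

    Minimizing the p-th power of the l_p norm is equivalent to minimizing
    the norm itself. Returns a pair (cost, x), or None if no multiplier
    exists within the search box.
    """
    g = reduce(gcd, a)
    best = None
    for x in product(range(-bound, bound + 1), repeat=len(a)):
        if sum(xi * ai for xi, ai in zip(x, a)) == g:
            cost = sum(abs(xi) ** p for xi in x)
            if best is None or cost < best[0]:
                best = (cost, x)
    return best

# gcd(6, 10, 15) = 1, and 6 + 10 - 15 = 1.
print(shortest_gcd_multiplier([6, 10, 15], p=1, bound=2))
```

The search space grows as (2 * bound + 1)^n, so this is a definition check rather than a practical method.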

Our strongest result states that, unless NP $\subseteq$ 
DTIME($n^{\mathrm{poly}(\log n)}$), SGCDM$_p$ cannot be approximated within a 
factor [...], where $\gamma$ is an arbitrarily small positive constant. To prove 
this inapproximability result we start by constructing a gap-preserving 
reduction from the Min Total Label Cover problem in $\ell_1$-norm to the 
SGCDM$_1$ problem. In contrast we show that, for the $\ell_2$-norm, 
approximation within a factor of $\sqrt{n}$ is very unlikely NP-hard using an 
interactive proof system. We conclude by analyzing the approximation quality of 
an existing algorithm. 

2 Preliminaries 

Our proofs closely follow ideas in [1, 21]. First, we briefly introduce some notation 
(see [3]). 

Definition 1. For an input I of a (positive-valued) minimization problem $\Pi$ 
whose optimal solution has value $\mathrm{opt}_\Pi(I)$, an algorithm A (which 
produces a solution with value A(I)) is said to approximate 
$\mathrm{opt}_\Pi(I)$ within a factor f(I) iff 
$\mathrm{opt}_\Pi(I) \leq A(I) \leq f(I) \cdot \mathrm{opt}_\Pi(I)$, where 
$f(I) \geq 1$. 

Definition 2. Let $\Pi$ and $\Pi'$ be two minimization problems and 
$g, g' \geq 1$. A gap-preserving reduction from $\Pi$ to $\Pi'$ with parameters 
((c, g), (c', g')) is a polynomial-time transformation $\tau$ mapping every 
instance I of $\Pi$ to an instance $I' = \tau(I)$ of $\Pi'$ such that for the 
optima $\mathrm{opt}_\Pi(I)$ and $\mathrm{opt}_{\Pi'}(I')$ of I and I', 
respectively, the following hold: 

$$\mathrm{opt}_\Pi(I) \leq c \implies \mathrm{opt}_{\Pi'}(I') \leq c' \quad\text{and}$$
$$\mathrm{opt}_\Pi(I) > c \cdot g \implies \mathrm{opt}_{\Pi'}(I') > c' \cdot g',$$

where c, g and c', g' depend on the instance sizes |I| and |I'|, respectively. 

3 Hardness of Shortest Z-Solution of Linear System 

To prove our inapproximability result we construct a gap-preserving reduction 
from the Min Total Label Cover problem in $\ell_1$-norm to the SGCDM$_1$ 
problem via the problems Shortest Z-Solution of Linear System in 
$\ell_1$-norm and Shortest Diophantine Equation Solution in $\ell_1$-norm. 

3.1 The Min Total Label Cover Problem 

In the following $G = (V_1, V_2, E)$ denotes a bipartite graph, B a set of labels 
for the vertices in $V_1 \cup V_2$, and for every $e \in E$ there exists a 
partial function $p_e \colon B \to B$ describing the admissible pairs of labels. 
We adapt the notation of [2, 1]. 

Definition 3. A labeling of $G = (V_1, V_2, E)$ is a pair 
$(\mathcal{P}_1, \mathcal{P}_2)$ of functions 
$\mathcal{P}_i \colon V_i \to 2^B$, i = 1, 2, which assign to each vertex in 
$V_1 \cup V_2$ a possibly empty set of labels. 

Definition 4. Let $(\mathcal{P}_1, \mathcal{P}_2)$ be a labeling of 
$G = (V_1, V_2, E)$ and let $e = (v_1, v_2)$, $v_1 \in V_1$, $v_2 \in V_2$, be an 
edge of G. We call $e = (v_1, v_2)$ covered iff 
$\mathcal{P}_1(v_1) \neq \emptyset$, $\mathcal{P}_2(v_2) \neq \emptyset$ and for 
all labels $b_2 \in \mathcal{P}_2(v_2)$ there exists a label 
$b_1 \in \mathcal{P}_1(v_1)$ such that $p_e(b_1) = b_2$. A labeling 
$(\mathcal{P}_1, \mathcal{P}_2)$ of $G = (V_1, V_2, E)$ is called a total-cover 
of G iff every edge of G is covered by the labeling 
$(\mathcal{P}_1, \mathcal{P}_2)$. 

Definition 5. The $\ell_1$-cost of a labeling $(\mathcal{P}_1, \mathcal{P}_2)$ for a graph $G = (V_1, V_2, E)$ is
defined as $cost(\mathcal{P}_1, \mathcal{P}_2) = \sum_{i=1,2} \sum_{v_j \in V_i} |\mathcal{P}_i(v_j)|$.

Definition 6. Min Total Label Cover in $\ell_1$-norm (MinTLC$_1$)

INSTANCE: A $d$-regular bipartite graph $G = (V_1, V_2, E)$, a set of labels $B =
\{1, \dots, \mathcal{N}\}$, $\mathcal{N} \in \mathbb{N}^+$, and for every edge $e \in E$ a partial function $p_e : B \to B$
such that $p_e^{-1}(1) \ne \emptyset$ for the distinguished label $1 \in B$

SOLUTION: A total-cover $(\mathcal{P}_1, \mathcal{P}_2)$ of $G$
MEASURE: The $\ell_1$-cost $cost(\mathcal{P}_1, \mathcal{P}_2)$ of the total-cover $(\mathcal{P}_1, \mathcal{P}_2)$

Remark 1. We can always ensure the existence of a total-cover with $\ell_1$-cost at
most $(|V_1| + 1)\mathcal{N}$; we simply let $\mathcal{P}_1(v_1) = B$ for all $v_1 \in V_1$ and $\mathcal{P}_2(v_2) = \{1\}$ for
all $v_2 \in V_2$.
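Definitions 4 and 5 and the labeling of Remark 1 can be checked mechanically on small instances. The following Python sketch is illustrative only: the dictionary-based encoding of labelings and of the partial functions $p_e$, as well as the toy instance, are assumptions made here and are not taken from the paper.

```python
# Sketch of Definitions 4 and 5; the data layout is an assumption for illustration.

def is_covered(edge, p1, p2, p_e):
    """Definition 4: edge (v1, v2) is covered by the labeling (p1, p2)."""
    v1, v2 = edge
    if not p1[v1] or not p2[v2]:
        return False
    # Labels of v2 that are images of labels of v1 under the partial function p_e.
    images = {p_e[edge][b1] for b1 in p1[v1] if b1 in p_e[edge]}
    return p2[v2] <= images


def is_total_cover(edges, p1, p2, p_e):
    """A labeling is a total-cover iff every edge is covered."""
    return all(is_covered(e, p1, p2, p_e) for e in edges)


def l1_cost(p1, p2):
    """Definition 5: total number of labels assigned on both sides."""
    return sum(len(s) for s in p1.values()) + sum(len(s) for s in p2.values())


# Hypothetical 2-regular instance with label set B = {1, 2}.
B = {1, 2}
V1, V2 = ["u1", "u2"], ["w1", "w2"]
EDGES = [(u, w) for u in V1 for w in V2]
P_E = {e: {1: 1, 2: 2} for e in EDGES}  # every p_e maps some label to 1

# Labeling of Remark 1: all labels on V1, the distinguished label 1 on V2.
trivial = ({u: set(B) for u in V1}, {w: {1} for w in V2})
# A cheaper labeling with exactly one label per vertex.
cheap = ({u: {1} for u in V1}, {w: {1} for w in V2})
# Not a total-cover: label 2 at w1 is not the image of label 1.
broken = ({u: {1} for u in V1}, {"w1": {2}, "w2": {1}})
```

On this toy instance the labeling of Remark 1 is a total-cover, while a labeling with one label per vertex attains the smaller cost $|V_1| + |V_2|$ that appears in the gap statements below.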

The Min Total Label Cover in $\ell_1$-norm is explicitly due to Khanna,
Sudan and Trevisan [17] and a similar form of the following Lemma is implicitly
proved in Lund and Yannakakis [18].

Lemma 1.

1. For every constant $g > 1$ there exists a polynomial-time transformation $\tau$
from 3-Sat to Min Total Label Cover such that, for all instances $I$:

$I \in$ 3-Sat $\implies \exists$ total-cover $(\mathcal{P}_1, \mathcal{P}_2)$ of $\tau(I)$ : $cost(\mathcal{P}_1, \mathcal{P}_2) = 1 \cdot (|V_1| + |V_2|)$

$I \notin$ 3-Sat $\implies \forall$ total-cover $(\mathcal{P}_1, \mathcal{P}_2)$ of $\tau(I)$ : $cost(\mathcal{P}_1, \mathcal{P}_2) > g \cdot (|V_1| + |V_2|)$.

2. There exists a quasi-polynomial-time, i.e., DTIME$(n^{\mathrm{poly}(\log n)})$, transformation
$\tau$ from 3-Sat to Min Total Label Cover such that, for all instances
$I$:

$I \in$ 3-Sat $\implies \exists$ total-cover $(\mathcal{P}_1, \mathcal{P}_2)$ of $\tau(I)$ : $cost(\mathcal{P}_1, \mathcal{P}_2) = 1 \cdot (|V_1| + |V_2|)$

$I \notin$ 3-Sat $\implies \forall$ total-cover $(\mathcal{P}_1, \mathcal{P}_2)$ of $\tau(I)$ : $cost(\mathcal{P}_1, \mathcal{P}_2) > g \cdot (|V_1| + |V_2|)$,

where $g = 2^{\log^{1-\gamma} n}$ with $\gamma$ an arbitrary small positive constant.

Proof. Both statements 1. and 2. are simultaneously proved by combining ideas
of Arora and Lund [2] and Lund and Yannakakis [18] with the recent result of
Raz [20]. The proof is deferred to the full version. □

3.2 Shortest Z-Solution of Linear System 

Shortest Z-Solution of Linear System in $\ell_1$-norm (SZSLS$_1$)

INSTANCE: A linear system $Ax = b$ of $m$ equations in $n$ variables where $A$ is a
rational $m \times n$ matrix and $b$ an $m$-dimensional rational vector
SOLUTION: A nonzero vector $x \in \mathbb{Z}^n$ satisfying $Ax = b$
MEASURE: The $\ell_1$-norm $\|x\|_1 := \sum_{1 \le i \le n} |x_i|$ of the vector $x$

Theorem 1. There exists a polynomial-time transformation $\tau$ from Min Total
Label Cover to Shortest Z-Solution of Linear System such that, for
all instances $I$ and for all $g > 1$:

$\mathrm{opt}_{\mathrm{MinTLC}_1}(I) = 1 \cdot (|V_1| + |V_2|) \implies \mathrm{opt}_{\mathrm{SZSLS}_1}(\tau(I)) = 1 \cdot (|V_1| + |V_2|)$
$\mathrm{opt}_{\mathrm{MinTLC}_1}(I) > g \cdot (|V_1| + |V_2|) \implies \mathrm{opt}_{\mathrm{SZSLS}_1}(\tau(I)) > g \cdot (|V_1| + |V_2|)$.

Proof. From a given Min Total Label Cover instance $I = (V_1, V_2, E, p, B, \mathcal{N})$
we construct a linear system of equations $Ax = b$ with $A$ an $m \times n$ matrix
of entries $\{0, 1\}$, $b$ an $m$-dimensional all-one vector, $m = |E|(\mathcal{N} + 1)$ and
$n = |V_1|\mathcal{N} + |V_2|\mathcal{N}$.

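For every pair $(v, b)$ with $v \in V_1 \cup V_2$ and $b \in B$ we define a column vector
$a_{v,b} \in \{0, 1\}^m$ of $A$ as follows. The $|E|(\mathcal{N} + 1)$ coordinates of $a_{v,b}$ are split into
$|E|$ blocks of $e$-projections $\Pi_e(a_{v,b})$, one $(\mathcal{N} + 1)$-length block for every edge
$e \in E$. In particular, we define for every $(v_2, b_2) \in V_2 \times B$

$\Pi_e(a_{v_2,b_2}) := (0, e_{b_2})$ iff $e$ is incident to $v_2$, and $\mathbf{0}$ otherwise,

and for every $(v_1, b_1) \in V_1 \times B$

$\Pi_e(a_{v_1,b_1}) := (1, \mathbf{1} - e_{p_e(b_1)})$ iff $e$ is incident to $v_1$ and $p_e(b_1) \ne \emptyset$, and $\mathbf{0}$ otherwise,

where $e_j$, $j = 1, \dots, \mathcal{N}$, denotes the $j$-th unit vector and $\mathbf{0}$, $\mathbf{1}$ the all-zero, all-one
vector of the appropriate dimension, respectively.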

Next, we define the right-hand side of our linear system, the vector $b$,
as the vector having 1 in each of its $|E|(\mathcal{N} + 1)$ coordinates.

Now let $x = \sum x_{v,b}\, a_{v,b}$ be an integer linear combination of the column
vectors $a_{v,b}$. Then, assigning every vertex $v$ a label $b$ iff $x_{v,b} \ne 0$ defines a labeling
$(\mathcal{P}_1^x, \mathcal{P}_2^x)$ induced by the vector $x$. From [1, Corollary 12] it follows that any such
$x$ induces a total-cover of $(V_1, V_2, E)$. Thus, any solution $x \in \mathbb{Z}^{|V_1|\mathcal{N} + |V_2|\mathcal{N}}$ of the
linear system $Ax = b$ induces a total-cover of $(V_1, V_2, E)$. Therefore, we deduce
that for any solution $x \in \mathbb{Z}^n$ of $Ax = b$ we must have $\|x\|_1 \ge \mathrm{opt}_{\mathrm{MinTLC}_1}(I)$.

On the other hand, assume now that $\mathrm{opt}_{\mathrm{MinTLC}_1}(I) = (|V_1| + |V_2|)$ and let
$(\mathcal{P}_1, \mathcal{P}_2)$ denote the corresponding labeling. Then, the vector $x$ given by

$x_{v_j, \mathcal{P}_i(v_j)} := 1 \quad \forall v_j \in V_i,\ i = 1, 2$

$x_{v_j, b} := 0 \quad \forall v_j \in V_i,\ \forall b \in B \setminus \mathcal{P}_i(v_j),\ i = 1, 2$

is a feasible solution of the linear system $Ax = b$ satisfying $\|x\|_1 = (|V_1| + |V_2|)$.

The reduction from the given instance $I$ of Min Total Label Cover to
the above constructed linear system $Ax = b$ is feasible in time polynomial in the
dimension of $A$ which in turn is polynomial in $|I|$. Clearly, the above reduction,
say $\tau$, is gap-preserving with parameters $((|V_1| + |V_2|, g), (|V_1| + |V_2|, g))$. □



4 Hardness of Approximating SGCDM$_1$

4.1 A new Aggregation Lemma 

The following Lemma establishes for the first time a polynomial-time reduction
from a system of inhomogeneous linear equations to a single equation with
identical $\ell_p$-bounded solution set. Previously, it was only known to hold by a result
of Kannan [16] for the case of identical $\ell_\infty$-bounded solution sets.

Lemma 2. Let $p, q \in \mathbb{N} \cup \{\infty\}$ with $1/p + 1/q = 1$, $A$ be an integral $m \times n$ matrix,
$\|A\|_{\infty,q} := \max_{1 \le i \le m} (\sum_{1 \le j \le n} |a_{ij}|^q)^{1/q}$, and $b$ be an integral $m$-dimensional
vector. Then

$B_p^n(\mu) \cap \{x \in \mathbb{Z}^n : Ax = b\} = B_p^n(\mu) \cap \{x \in \mathbb{Z}^n : \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} k^i a_{ij} x_j = \sum_{i=0}^{m-1} k^i b_i\}$

where $B_p^n(\mu)$ denotes the $n$-dimensional ball of $\ell_p$-radius $\mu$ centered at the origin
and $k > \|A\|_{\infty,q}\,\mu + \|b\|_\infty + 1$ an integer.

Proof. Denote the two sets by $S_m$ and $S_1$, respectively. Clearly, $S_m \subseteq S_1$. To
prove the reverse inclusion, suppose that there exists an element $x \in S_1$ not
satisfying at least one equation of $Ax = b$. Let $A = [a_1, \dots, a_m]^T$ and let
$i_{max}$ denote the largest index for which $\langle a_i, x \rangle \ne b_i$. As $\|x\|_p \le \mu$ we have
(using Hölder's inequality) $|\langle a_i, x \rangle - b_i| \le \|A\|_{\infty,q}\,\mu + \|b\|_\infty < k - 1$ and since
$x \in S_1$ we must have $\sum_i k^i (\langle a_i, x \rangle - b_i) = 0$. By definition of $i_{max}$ this
yields $\sum_{i < i_{max}} k^i (\langle a_i, x \rangle - b_i) = -k^{i_{max}} (\langle a_{i_{max}}, x \rangle - b_{i_{max}})$ with a nonzero
right-hand side implying that the left-hand side is also nonzero. Now the left-hand
side is both a multiple of $k^{i_{max}}$ and in absolute value bounded by $k^{i_{max}} - 1$, a
contradiction. □
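The statement of Lemma 2 can be confirmed by exhaustive search on a tiny system. The sketch below does this for $p = 1$, $q = \infty$; the example matrix, the radius $\mu = 3$, and all function names are assumptions chosen for illustration.

```python
from itertools import product


def aggregate(A, b, k):
    """Collapse Ax = b into one equation by weighting row i with k**i."""
    m, n = len(A), len(A[0])
    coeffs = [sum(k**i * A[i][j] for i in range(m)) for j in range(n)]
    rhs = sum(k**i * b[i] for i in range(m))
    return coeffs, rhs


def l1_ball(n, mu):
    """All integer points of l1-norm at most mu in dimension n."""
    return [x for x in product(range(-mu, mu + 1), repeat=n)
            if sum(abs(t) for t in x) <= mu]


def solutions(A, b, mu):
    """Integer solutions of Ax = b inside the l1-ball of radius mu."""
    n = len(A[0])
    return {x for x in l1_ball(n, mu)
            if all(sum(row[j] * x[j] for j in range(n)) == bi
                   for row, bi in zip(A, b))}


A = [[1, 1, 0], [0, 1, 1]]
b = [1, 1]
mu = 3
# For p = 1 and q = infinity, ||A||_{inf,q} is the largest absolute entry of A.
norm_A = max(abs(a) for row in A for a in row)
# Smallest integer k with k > ||A|| * mu + ||b||_inf + 1.
k = norm_A * mu + max(abs(t) for t in b) + 2
coeffs, rhs = aggregate(A, b, k)
```

Here the two equations collapse into $x_0 + 7x_1 + 6x_2 = 7$, and both the system and the single equation have the same solutions inside the ball.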

4.2 Hardness of Approximating Shortest Diophantine Equation 
Solution or Aggregation Part I 

Shortest Diophantine Equation Solution in $\ell_1$-norm (SDES$_1$)

INSTANCE: An equation $x_1 a_1 + \dots + x_n a_n = b$ with $a_1, \dots, a_n, b \in \mathbb{Z}$

SOLUTION: A vector $x \in \mathbb{Z}^n$ such that $\sum_{i=1}^{n} x_i a_i = b$

MEASURE: The $\ell_1$-norm $\|x\|_1 := \sum_{1 \le i \le n} |x_i|$ of the vector $x$

Theorem 2. There exists a polynomial-time transformation $\tau$ from (a restricted
subset of) Shortest Z-Solution of Linear System to Shortest Diophantine
Equation Solution such that:

1. for all instances $I$ and for all $c, g > 1$:

$\mathrm{opt}_{\mathrm{SZSLS}_1}(I) = c \implies \mathrm{opt}_{\mathrm{SDES}_1}(\tau(I)) = c$
$\mathrm{opt}_{\mathrm{SZSLS}_1}(I) > c \cdot g \implies \mathrm{opt}_{\mathrm{SDES}_1}(\tau(I)) > c \cdot g$

2. the constructed instance $a'_1 x_1 + \dots + a'_n x_n = b'$ of SDES$_1$ satisfies

$x a'_{j^*} = b'$ for some $j^* \in \{1, \dots, n\}$ and some $x \in \mathbb{Z}$.

Proof. Consider the linear system $Ax = b$ constructed in the reduction of
the proof of Theorem 1. Recall that for the underlying $d$-regular graph $G =
(V_1, V_2, E)$ with label set $B$, $|B| = \mathcal{N}$, the matrix $A$ is an $m \times n$ matrix of entries
$\{0, 1\}$ with $m = |E|(\mathcal{N} + 1)$ and $n = |V_1|\mathcal{N} + |V_2|\mathcal{N}$, and that the vector $b$ has
only 1-entries in its coordinates.

Let $a_{j^*} := a_{v_2,b_2}$ for some $v_2 \in V_2$, $b_2 \in B$ be a nonzero column vector of
$A$. Since the graph $G$ is $d$-regular, $a_{j^*}$ has exactly $d$ 1-entries. Let $\sigma$ be the
permutation shifting all 1-entries of $a_{j^*}$ to its first $d$ coordinates and note that
$\sigma(b) = b$.

Aggregating now the (solution equivalent) linear system $\sigma(A)x = b$ via
Lemma 2 yields the inhomogeneous Diophantine equation

$\sum_{j=0}^{|V_1|\mathcal{N} + |V_2|\mathcal{N} - 1} a'_j x_j = b'$

with $a'_j := \sum_{i=0}^{|E|(\mathcal{N}+1)-1} k^i a_{ij}$ and $b' := \sum_{i=0}^{|E|(\mathcal{N}+1)-1} k^i b_i$. In particular, we have

$a'_{j^*} = \sum_{i=0}^{d-1} k^i = \frac{k^d - 1}{k - 1}$ and $b' = \sum_{i=0}^{|E|(\mathcal{N}+1)-1} k^i = \frac{k^{|E|(\mathcal{N}+1)} - 1}{k - 1}$

and from the regularity of $G$, i.e., $d|V_2| = |E|$, it follows that $(k^d - 1)$ divides $(k^{|E|(\mathcal{N}+1)} - 1)$,
hence $a'_{j^*}$ divides $b'$.

Thus, we have constructed a gap-preserving reduction from the linear system
$Ax = b$ to an inhomogeneous Diophantine equation instance $a'_1 x_1 + \dots + a'_n x_n =
b'$ with parameters $((c, g), (c, g))$ such that there is an index $j^* \in \{1, \dots, n\}$
satisfying $a'_{j^*} \mid b'$. □
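The divisibility claim at the end of the proof rests only on the fact that $d$ divides $|E|(\mathcal{N}+1)$. A small numeric check, with parameter values picked arbitrarily for illustration, could look as follows.

```python
def geometric_sum(k, length):
    """k**0 + k**1 + ... + k**(length - 1)."""
    return sum(k**i for i in range(length))


def aggregated_pair(k, d, num_v2, num_labels):
    """Return (a'_{j*}, b') for a d-regular graph, where |E| = d * |V_2|."""
    num_edges = d * num_v2
    m = num_edges * (num_labels + 1)  # number of equations |E|(N + 1)
    return geometric_sum(k, d), geometric_sum(k, m)


# Hypothetical parameters: k = 5, d = 2, |V_2| = 3, N = 2.
a_star, b_prime = aggregated_pair(5, 2, 3, 2)
```

With these values $a'_{j^*} = 6$ and $b' = (5^{18} - 1)/4$, and the former divides the latter because $2$ divides $18$.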



4.3 The Final Reduction or Aggregation Part II 

The basic construction. The Min Total Label Cover problem in $\ell_1$-norm
is not approximable within a factor $2^{\log^{1-\gamma} n}$, where $\gamma$ is an arbitrary small
positive constant, unless NP $\subseteq$ DTIME$(n^{\mathrm{poly}(\log n)})$. From the gap-preserving
reductions we have that the same inapproximability factor holds for the problems
SZSLS$_1$ and SDES$_1$.

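Theorem 3. There exists a polynomial-time transformation $\tau$ from (a restricted
subset of) SDES$_1$ to SGCDM$_1$ such that, for all instances $I$ and for all $c, g > 1$:

$\mathrm{opt}_{\mathrm{SDES}_1}(I) = c \implies \mathrm{opt}_{\mathrm{SGCDM}_1}(\tau(I)) = c + 1$
$\mathrm{opt}_{\mathrm{SDES}_1}(I) > c \cdot g \implies \mathrm{opt}_{\mathrm{SGCDM}_1}(\tau(I)) > c \cdot g + 1$.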

Proof. We start with the instance $a'_1 x_1 + \dots + a'_n x_n = b'$ of SDES$_1$ constructed
in the above Theorem 2 and consider for an arbitrary integer $c \in \mathbb{Z} \setminus \{0\}$ the
linear system

$c\,x_{n+1} = c$
$a'_1 x_1 + \dots + a'_n x_n - b' x_{n+1} = 0$

which forces the variable $x_{n+1}$ to take on the value 1. Now, we fix $g > 1$ and apply
Lemma 2 with suitable $k$ to this linear system, obtaining the single equation

$k a'_1 x_1 + \dots + k a'_n x_n + (c - k b') x_{n+1} = c$.

We observe that the right-hand side $c$ in the last equation was an arbitrarily
chosen integer and that $b'$ satisfies (by Theorem 2)

$x a'_{i^*} = b'$ for some $i^* \in \{1, \dots, n\}$ and some $x \in \mathbb{Z}$. (*)

This will give us the desired gap-preserving reduction $\tau$. Namely, we choose
$c = \gcd(a'_1, \dots, a'_n)$. By (*) this implies

$\gcd(k a'_1, \dots, k a'_n, (c - k b'))$

$= \gcd(k a'_1, \dots, k a'_{i^*-1}, k a'_{i^*+1}, \dots, k a'_n, \gcd(k a'_{i^*}, (c - k x a'_{i^*})))$

$= c$.

Since the variable $x_{n+1}$ is forced to take on the value 1, it is obvious from the
above construction that the reduction $\tau$ with $\tau(I) = \{k a'_1, \dots, k a'_n, (c - k b')\}$
satisfies the claim of the Theorem. □
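The gcd identity used in this proof is easy to confirm numerically. The following sketch assumes a toy SDES instance and an arbitrary multiplier $k = 7$; it does not choose $k$ according to the bound of Lemma 2, since the gcd identity itself holds for every integer $k$.

```python
from functools import reduce
from math import gcd


def sdes_to_sgcdm(a, b, k):
    """Map the equation sum(a_i x_i) = b to the list (k a_1, ..., k a_n, c - k b)."""
    c = reduce(gcd, a)
    return [k * ai for ai in a] + [c - k * b], c


def extend_solution(x):
    """An SDES solution becomes a GCD multiplier by appending x_{n+1} = 1."""
    return list(x) + [1]


# Toy instance: b = 5 * a_1, so condition (*) holds with i* = 1 and x = 5.
a, b = [6, 10, 15], 30
instance, c = sdes_to_sgcdm(a, b, k=7)
multiplier = extend_solution([5, 0, 0])  # 5 * 6 = 30 solves the SDES instance
```

The resulting list is $(42, 70, 105, -209)$ with gcd $1 = c$, and the extended solution $(5, 0, 0, 1)$ is a GCD multiplier for it.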

Treating other norms Close inspection of the whole proof shows that it also 
implies the following general result. 

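Theorem 4. There exists a polynomial-time transformation $\tau$ from SDES$_1$ to
SGCDM$_p$ such that, for all instances $I$ and for all $c, g > 1$:

$\mathrm{opt}_{\mathrm{SDES}_1}(I) = c \implies \mathrm{opt}_{\mathrm{SGCDM}_p}(\tau(I)) = \sqrt[p]{c + 1}$

$\mathrm{opt}_{\mathrm{SDES}_1}(I) > c \cdot g \implies \mathrm{opt}_{\mathrm{SGCDM}_p}(\tau(I)) > \sqrt[p]{c \cdot g + 1}$.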

4.4 Implications on the complexity of SGCDM$_p$
We are now ready to present our main result.

Theorem 5.

1. Unless NP $\subseteq$ P, there exists no polynomial-time algorithm which approximates
the Shortest GCD Multiplier problem in $\ell_p$-norm within a factor of
$k$, where $k \ge 1$ is an arbitrary constant.

2. Unless NP $\subseteq$ DTIME$(n^{\mathrm{poly}(\log n)})$, there exists no polynomial-time algorithm
which approximates the Shortest GCD Multiplier problem in $\ell_p$-norm
within a factor $n^{1/(p \log^{\gamma} n)}$ for $\gamma$ an arbitrary small positive constant.

Proof. Combining the gap-preserving reductions of Lemma 1 (first statement),
Theorem 1, Theorem 2 and Theorem 4 gives for all instances $I$:

$I \in$ 3-Sat $\implies \mathrm{opt}_{\mathrm{SGCDM}_p}(\tau(I)) = \sqrt[p]{|V_1| + |V_2| + 1}$

$I \notin$ 3-Sat $\implies \mathrm{opt}_{\mathrm{SGCDM}_p}(\tau(I)) > \sqrt[p]{g(|V_1| + |V_2|) + 1}$,

where $g > 1$ can be any positive constant. Therefore, a polynomial-time
algorithm approximating the SGCDM$_p$ problem within a factor of $k = \sqrt[p]{g}$
would enable us to decide 3-Sat in polynomial time, thus proving the first claim.

The second claim follows similarly, via the second statement of Lemma 1
instead of its first statement. □

Corollary 1. For every $\ell_p$-norm, the Extended GCD problem is NP-complete
(in its feasibility recognition form).

5 Limits on the Hardness of Approximating SGCDM$_2$

Theorem 6. The $\sqrt{n/O(\log n)}$-Shortest GCD Multiplier in $\ell_2$-norm is
not NP-hard unless the Polynomial-Time Hierarchy collapses to its second level.

Proof. We only give a brief outline of the proof, which uses results from Babai 
[4], Boppana et al. [6], Goldreich and Goldwasser [7], Goldwasser et al. [8] and 
Goldwasser and Sipser [9]. 

Along the thread of the above Theorem and Boppana et al. [6] we have
to show a constant-round interactive proof system for co-$g(n)$-SGCDM$_2$ with
$g = \sqrt{n/O(\log n)}$. Notice that the input to co-$g(n)$-SGCDM$_2$ is a tuple $(a, t)$ with
$a \in \mathbb{Z}^n$ and $t \in \mathbb{R}$ and we have to show that yes-instances (i.e., $\mathrm{opt}_{\mathrm{SGCDM}_2}(a) >
g \cdot t$) are always accepted, whereas no-instances (i.e., $\mathrm{opt}_{\mathrm{SGCDM}_2}(a) \le t$) are
accepted with probability bounded away from 1. The following simple interactive
proof system establishes all that we need for an input tuple $(a, t)$:



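V$(a, t)$                                                        P$(a, t)$

compute basis $[b_1, \dots, b_{n-1}]$ of lattice $\mathbb{Z}^n \cap (a\mathbb{R})^{\perp}$
compute $v \in \mathbb{Z}^n$ with $\langle v, a \rangle = \gcd(a)$
let $M := \max\{\|b_1\|, \dots, \|b_{n-1}\|, \|v\|\}$
pick $r \in_R \{p \in L(B) : \|p\|_2 \le 2^n \cdot M\}$
pick $\eta \in_R \{p \in \mathbb{R}^n : \|p\|_2 \le \dots\}$
pick $\sigma \in_R \{0, 1\}$
$x := r + \sigma v + \eta$
                              --- $x$ --->
                                                                 P computes $\tau := 0$ if $\dots$, and $1$ otherwise
                              <--- $\tau$ ---
accept iff $\tau = \sigma$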



Following Goldreich and Goldwasser [7], the claimed properties of the above
protocol are witnessed below by two Lemmas, whose easy proofs are omitted.

Lemma 3. If $\mathrm{opt}_{\mathrm{SGCDM}_2}(a) > g \cdot t$ then there exists a prover P such that V
accepts with probability 1 when interacting with P.

Lemma 4. Let $c > 0$ and $g(n) \ge \sqrt{n/(c \log n)}$. If $\mathrm{opt}_{\mathrm{SGCDM}_2}(a) \le t$ then,
for any P interacting with V, the verifier V accepts with probability at most
$1 - 1/n^{2c}$.

Finally, transforming the above protocol with the methods of Babai [4] and
Goldwasser and Sipser [9] into an AM proof system yields for the gap $g =
\sqrt{n/O(\log n)}$ the inclusion $g(n)$-SGCDM$_2$ $\in$ NP $\cap$ coAM. The rest of the proof
is straightforward. □
6 An LLL-based approximation algorithm

Here we give an overview of a polynomial time approximation algorithm for the
$\ell_2$-norm. This and related algorithms are studied in [14]:



compute basis $[b_1, \dots, b_{n-1}]$ of lattice $\mathbb{Z}^n \cap (a\mathbb{R})^{\perp}$
compute multiplier $v \in \mathbb{Z}^n$ with $\langle v, a \rangle = \gcd(a)$
apply the LLL-reduction to the basis $[b_1, \dots, b_{n-1}]$
compute $x' \in L(B)$ with $\|x' - v\| \le 2^{(n-1)/2} \min_{u \in L(B)} \|u - v\|$
    by applying Babai's nearest plane algorithm to the reduced basis
    and the target vector $v$
output short multiplier $x := v - x'$



Theorem 7. On inputs $a_1, \dots, a_n \in \mathbb{Z}$ the above algorithm computes in
polynomial time a vector $x$ with $\langle x, a \rangle = \gcd(a)$ satisfying

$\|x\| \le 2^{(n-1)/2} \cdot \mathrm{opt}_{\mathrm{SGCDM}_2}(a_1, \dots, a_n)$.



Proof. The lattice $\mathbb{Z}^n \cap (a\mathbb{R})^{\perp}$ constitutes an $(n-1)$-dimensional lattice in $\mathbb{Z}^n$. As
shown by Babai [5], the nearest plane heuristic run on the LLL-reduced basis
of $[b_1, \dots, b_{n-1}]$ and the target vector $v$ delivers a vector $x' \in L(B)$ satisfying
$\|x' - v\| \le 2^{(n-1)/2} \min_{u \in L(B)} \|u - v\|$. The Theorem follows. □
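To make the problem concrete, the sketch below covers only the step "compute multiplier $v$", by iterating the extended Euclidean algorithm, and then compares $v$ with a shortest multiplier found by exhaustive search in a small box. It deliberately does not implement LLL reduction or the nearest plane step; the function names and the example input are assumptions for illustration.

```python
from functools import reduce
from itertools import product
from math import gcd


def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b) and g >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t


def gcd_multiplier(a):
    """Some integer vector v with <v, a> = gcd(a); usually far from shortest."""
    g, v = 0, []
    for ai in a:
        g, s, t = ext_gcd(g, ai)
        v = [s * c for c in v] + [t]
    return v


def shortest_multiplier_in_box(a, radius):
    """Exhaustive search for an l2-shortest multiplier with |x_i| <= radius."""
    g = reduce(gcd, a)
    best = None
    for x in product(range(-radius, radius + 1), repeat=len(a)):
        if sum(xi * ai for xi, ai in zip(x, a)) != g:
            continue
        if best is None or sum(t * t for t in x) < sum(t * t for t in best):
            best = x
    return best


a = [6, 10, 15]
v = gcd_multiplier(a)
x = shortest_multiplier_in_box(a, radius=2)
```

For this input the Euclidean multiplier is $(-14, 7, 1)$ while the shortest one is $(1, 1, -1)$. Since the latter has squared norm 3, every shorter candidate would lie inside the searched box, so the search result is globally optimal here.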



7 Conclusions 

We have shown that the shortest GCD multiplier problem (in its feasibility
recognition form) is NP-complete for every $\ell_p$-norm with $p \in \mathbb{N}$. Furthermore,
approximating an $\ell_p$-norm shortest GCD multiplier within a factor $n^{1/(p \log^{\gamma} n)}$
for $\gamma$ an arbitrary small positive constant is quasi-NP-hard.

On the positive side, we specialize to the $\ell_2$-norm. Approximating the
solution to the $\ell_2$-norm problem within a factor of $\sqrt{n}/\sqrt{O(\log n)}$ is very unlikely
to be NP-hard. We see that there is a transition in theoretical complexity at
around $\sqrt{n}$: approximating within a factor of $n^{1/(2 \log^{\gamma} n)}$ is quasi-NP-hard; within a factor of
$\sqrt{n}/\sqrt{O(\log n)}$ it is unlikely to be NP-hard.

The best known polynomial time algorithm for the $\ell_2$-norm problem achieves
a factor of $2^{(n-1)/2}$. It remains to be seen whether this substantial gap between
around $\sqrt{n}$ and $2^{(n-1)/2}$ can be reduced or whether an LLL-based algorithm can
be better analyzed in some circumstances.
Abstract. We study the complexity of the confluence problem for restricted
kinds of semi-Thue systems, vector replacement systems and
general trace rewriting systems. We prove that confluence for length-reducing
semi-Thue systems is P-complete and that this complexity
reduces to NC$^2$ in the monadic case. For length-reducing vector replacement
systems we prove that the confluence problem is PSPACE-complete
and that the complexity reduces to NP and P for monadic systems
and special systems, respectively. Finally we prove that for special
trace rewriting systems, confluence can be decided in polynomial time
and that the extended word problem for special trace rewriting systems
is undecidable.



1 Introduction 

Rewriting systems that operate on different kinds of objects have received a lot 
of attention in computer science. Two of the most intensively studied types of 
rewriting systems are semi-Thue systems [BO93], which operate on free monoids,
and vector replacement systems (or equivalently Petri-nets), which operate on 
free commutative monoids. Both of these types of rewriting systems may be 
seen as special cases of trace rewriting systems [Die90]. Trace rewriting systems 
operate on free partially commutative monoids, which are in computer science 
better known as trace monoids. Trace monoids were introduced by [Maz77] into 
computer science as a model of concurrent systems. 

Confluence is a very desirable property for all kinds of rewriting systems 
since it implies that the order in which rewrite steps are performed is irrelevant. 
Several decidability and undecidability results are known for the confluence
problem for the different types of rewriting systems mentioned above. For
length-reducing semi-Thue systems confluence can be decided in polynomial time, see
e.g. [BO93], Corollary 3.2.2. On the other hand there exists a trace monoid such
that confluence is undecidable for length-reducing trace rewriting systems over
this trace monoid [NO88]. In [Loh98] this result was even sharpened. It was
shown that unless the underlying trace monoid is free or free commutative,
confluence is undecidable for length-reducing trace rewriting systems. Concerning
vector replacement systems it was shown in [VRL98] that confluence is decidable
but EXPSPACE-hard for the class of all vector replacement systems.

In this paper we will continue the investigation of the confluence problem for 
different kinds of rewriting systems. In Section 3 we will prove that confluence for
length-reducing semi-Thue systems is not only solvable in polynomial time but
furthermore P-complete, which roughly means that it is inherently sequential. 
On the other hand for the more restricted class of monadic semi-Thue systems 
(where monadic means that all right-hand sides consist of at most one symbol) 
there exists an efficient parallel algorithm that decides confluence. Concerning 
vector replacement systems we prove in Section 4 that for the length-reducing 
case, confluence is PSPACE-complete and that this complexity reduces for the 
monadic case and the special case (where special means that all right-hand sides 
are empty) to NP and P, respectively. Finally in Section 5 we prove that conflu- 
ence is decidable for special trace rewriting systems in polynomial time which 
solves a question from [Die90]. We end this paper by showing that in contrast 
to semi-Thue systems the extended word problem, see [BO85], is undecidable
even for special trace rewriting systems that contain only one rule. Proofs that 
are omitted in this paper can be found in the long version [Loh99]. 

2 Preliminaries 

In this section we introduce some notations that we will use in this paper. For
an alphabet $\Sigma$, $\Sigma^*$ denotes the set of all finite words of elements of $\Sigma$. The
empty word is denoted by 1. The length of the word $s$ is denoted by $|s|$. As
usual $\Sigma^+ = \Sigma^* \setminus \{1\}$ and $\Sigma^n = \{s \in \Sigma^* \mid |s| = n\}$. The set of all letters
that occur in the word $s$ is denoted by $alph(s)$. For a natural number $n \in \mathbb{N}$
let $ld(n)$ denote the logarithm of $n$ to the base 2. Let $bit(n) = \lfloor ld(n) \rfloor + 1$ if
$n > 0$ and $bit(0) = 1$, i.e., $bit(n)$ is the length of the binary representation of
$n$. For a vector $n = (n_1, \dots, n_k) \in \mathbb{N}^k$ let $bit(n) = bit(n_1) + \dots + bit(n_k)$. We
assume that the reader is familiar with the basic notions of complexity theory, in
particular with the complexity classes P, NP, and PSPACE, see e.g. [Pap94]. Let
us just briefly mention the definition of the parallel complexity class NC$^k$ where
$k \ge 1$, see [GHR95] for more details. A language $L \subseteq \{a, b\}^*$ is in NC$^k$ if for
every $n \ge 1$ there exists a Boolean circuit with $n$ linearly ordered inputs that (i)
can be calculated from $n$ in deterministic logarithmic space, (ii) contains $n^{O(1)}$
many gates of fan-in at most two, (iii) has depth $O(ld^k(n))$, and (iv) accepts
the language $L \cap \{a, b\}^n$, where $a$ ($b$) corresponds to the truth value true (false).
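The size measure $bit$ is used later to define the length of a vector replacement system. A direct transcription into Python, with function names chosen here for illustration, is:

```python
def bit(n: int) -> int:
    """Length of the binary representation of n, with bit(0) = 1."""
    return 1 if n == 0 else n.bit_length()


def bit_vec(vector) -> int:
    """bit applied to a vector: the sum of bit over all components."""
    return sum(bit(component) for component in vector)
```

For positive $n$, Python's `int.bit_length` agrees with $\lfloor ld(n) \rfloor + 1$.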

In the following we introduce some notions concerning trace theory, see
[DR95] for more details. An independence alphabet $(\Sigma, I)$ consists of a finite
alphabet $\Sigma$ and an irreflexive and symmetric relation $I \subseteq \Sigma \times \Sigma$, called an
independence relation. Given an independence alphabet $(\Sigma, I)$ we define the trace
monoid $M(\Sigma, I)$ as the quotient monoid $\Sigma^*/\equiv_I$, where $\equiv_I$ denotes the least
equivalence relation that contains all pairs of the form $(sabt, sbat)$ for $(a, b) \in I$
and $s, t \in \Sigma^*$, which is a congruence on $\Sigma^*$. An element of $M(\Sigma, I)$, i.e., an
equivalence class of words, is called a trace. The trace that contains the word $s$
is denoted by $[s]_I$. The empty trace $[1]_I$ will be also denoted by 1. Concatenation
of traces is defined by $[s]_I [t]_I = [st]_I$. Since for all words $s, t \in \Sigma^*$, $s \equiv_I t$ implies
$|s| = |t|$ and $alph(s) = alph(t)$, we can define $|[s]_I| = |s|$ and $alph([s]_I) = alph(s)$.
We write $u\,I\,v$ if $alph(u) \times alph(v) \subseteq I$. For the rest of this section let $(\Sigma, I)$ be
an arbitrary independence alphabet and let $M = M(\Sigma, I)$. If $I = (\Sigma \times \Sigma) \setminus Id_\Sigma$,
where $Id_\Sigma = \{(a, a) \mid a \in \Sigma\}$, then $M$ is isomorphic to the free commutative
monoid $\mathbb{N}^{|\Sigma|}$ over $|\Sigma|$ many generators and we identify traces from $M$ with
$|\Sigma|$-dimensional vectors over $\mathbb{N}$. On the other hand if $I = \emptyset$ then $M$ is isomorphic
to the free monoid $\Sigma^*$. The following lemma is a simple generalization of
the well known Levi's lemma for traces [CP85], see [Loh99] for a proof of this
generalization.
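Whether two words represent the same trace can be tested without enumerating equivalence classes. The sketch below relies on the standard projection criterion from trace theory, which is not stated in this paper: $s \equiv_I t$ iff the projections of $s$ and $t$ onto every pair of dependent letters (including each single letter) coincide. The function name and the encoding of $I$ as a set of ordered pairs are assumptions.

```python
def trace_equivalent(s, t, independence):
    """Decide s =_I t; `independence` is a symmetric, irreflexive set of pairs."""
    letters = set(s) | set(t)

    def project(word, a, b):
        # Keep only the occurrences of the letters a and b, in order.
        return [x for x in word if x == a or x == b]

    for a in letters:
        for b in letters:
            if (a, b) in independence:
                continue  # independent letters may be reordered freely
            if project(s, a, b) != project(t, a, b):
                return False
    return True


# Example alphabet {a, b, c} in which only a and b are independent.
I = {("a", "b"), ("b", "a")}
```

With this relation, `abc` and `bac` denote the same trace, whereas `abc` and `acb` do not, because `b` and `c` are dependent.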



Lemma 1. Let $u_1, u_2, u_3, v_1, v_2, v_3 \in M$. Then $u_1 u_2 u_3 = v_1 v_2 v_3$ iff there exist
$w_{i,j} \in M$ ($1 \le i, j \le 3$) such that (i) $u_i = w_{i,1} w_{i,2} w_{i,3}$ for $1 \le i \le 3$, (ii)
$v_j = w_{1,j} w_{2,j} w_{3,j}$ for $1 \le j \le 3$, and (iii) $w_{i,j}\,I\,w_{k,l}$ if $i < k$ and $l < j$.



The diagram below visualizes the situation in the
lemma. The $i$-th column represents $u_i$, the $j$-th row represents
$v_j$, the intersection of the $i$-th column and the
$j$-th row represents $w_{i,j}$, and $w_{i,j}$ and $w_{k,l}$ are independent
if one of them is north-west of the other one.



$v_3$    $w_{1,3}$    $w_{2,3}$    $w_{3,3}$
$v_2$    $w_{1,2}$    $w_{2,2}$    $w_{3,2}$
$v_1$    $w_{1,1}$    $w_{2,1}$    $w_{3,1}$
         $u_1$        $u_2$        $u_3$



A trace rewriting system, briefly TRS, over the trace monoid $M$ is a finite
subset of $M \times M$. In the rest of this section let $\mathcal{R}$ be a TRS over $M$. If $I = \emptyset$,
i.e., $M \cong \Sigma^*$, then $\mathcal{R}$ is also called a semi-Thue system, briefly STS, over $\Sigma$, see
[BO93] for more details on STSs. On the other hand if $I = (\Sigma \times \Sigma) \setminus Id_\Sigma$, i.e.,
$M \cong \mathbb{N}^{|\Sigma|}$, then $\mathcal{R}$ is also called a vector replacement system, briefly VRS, in
the dimension $|\Sigma|$. Vector replacement systems are easily seen to be equivalent
to Petri-nets. An element $(\ell, r) \in \mathcal{R}$ is also denoted by $\ell \to r$. The set $\{\ell \mid
\exists r \in M : (\ell, r) \in \mathcal{R}\}$ of all left-hand sides of $\mathcal{R}$ is denoted by $dom(\mathcal{R})$. The set
$ran(\mathcal{R})$ of all right-hand sides of $\mathcal{R}$ is defined analogously. Given $c = (\ell, r) \in \mathcal{R}$
and $s, t \in M$, we write $s \to_c t$ if $s = u \ell v$ and $t = u r v$ for some $u, v \in M$. We
write $s \to_\mathcal{R} t$ if $s \to_c t$ for some $c \in \mathcal{R}$. As usual, $\to_\mathcal{R}^+$ ($\to_\mathcal{R}^*$) is the transitive
(reflexive and transitive) closure of $\to_\mathcal{R}$ and $\leftrightarrow_\mathcal{R}^*$ is the reflexive, transitive, and
symmetric closure of $\to_\mathcal{R}$. The pair $(u, v) \in M \times M$ is confluent (with respect
to $\mathcal{R}$) if $u \to_\mathcal{R}^* w$ and $v \to_\mathcal{R}^* w$ for some $w \in M$. The TRS $\mathcal{R}$ is confluent on the
trace $u \in M$ if for all $v_1, v_2 \in M$ with $u \to_\mathcal{R}^* v_1$ and $u \to_\mathcal{R}^* v_2$ the pair $(v_1, v_2)$ is
confluent. The TRS $\mathcal{R}$ is confluent if $\mathcal{R}$ is confluent on all $u \in M$. The TRS $\mathcal{R}$
is locally confluent if for all $u, v_1, v_2 \in M$ with $u \to_\mathcal{R} v_1$ and $u \to_\mathcal{R} v_2$ the pair
$(v_1, v_2)$ is confluent. The TRS $\mathcal{R}$ is terminating if there does not exist an infinite
chain $u_1 \to_\mathcal{R} u_2 \to_\mathcal{R} u_3 \to_\mathcal{R} \cdots$. If $\mathcal{R}$ is terminating then by Newman's lemma
$\mathcal{R}$ is confluent iff $\mathcal{R}$ is locally confluent. A trace $u \in M$ is irreducible (with respect
to $\mathcal{R}$) if there does not exist a $v \in M$ with $u \to_\mathcal{R} v$. The set of all $u \in M$ that are
irreducible with respect to $\mathcal{R}$ is denoted by IRR($\mathcal{R}$). The trace $v$ is a normalform
of $u$ if $u \to_\mathcal{R}^* v$ and $v \in$ IRR($\mathcal{R}$). The TRS $\mathcal{R}$ is length-reducing if $|\ell| > |r|$ for all
$(\ell, r) \in \mathcal{R}$. Obviously, if $\mathcal{R}$ is length-reducing then $\mathcal{R}$ is terminating. The TRS $\mathcal{R}$
is monadic if $\mathcal{R}$ is length-reducing and $ran(\mathcal{R}) \subseteq \{1\} \cup \Sigma$. The TRS $\mathcal{R}$ is special
if $ran(\mathcal{R}) = \{1\}$ and $1 \notin dom(\mathcal{R})$. Let COLR($M$) (COMO($M$), COSP($M$))
denote the set of all confluent TRSs over $M$ that are length-reducing (monadic,
special). The uniform word problem for a class $C$ of TRSs over $M$ is the following
decision problem: Given $\mathcal{R} \in C$ and $u, v \in M$, does $u \leftrightarrow_\mathcal{R}^* v$ hold?
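For semi-Thue systems (the case $I = \emptyset$) these notions translate directly into string operations. The following sketch is a naive illustration that assumes a terminating system whose left-hand sides are non-empty, so that the set of descendants of a word is finite; the function names are not from the paper, and the search may take exponential time.

```python
def successors(word, rules):
    """All words reachable from `word` by one rewrite step."""
    result = set()
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            result.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return result


def descendants(word, rules):
    """All words reachable by zero or more steps (finite if terminating)."""
    seen, stack = {word}, [word]
    while stack:
        current = stack.pop()
        for nxt in successors(current, rules):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_confluent_pair(u, v, rules):
    """The pair (u, v) is confluent iff u and v have a common descendant."""
    return bool(descendants(u, rules) & descendants(v, rules))


def is_irreducible(word, rules):
    return not successors(word, rules)


# A special STS (all right-hand sides empty) and a non-confluent monadic one.
SPECIAL = [("ab", ""), ("ba", "")]
BAD = [("ab", "a"), ("ab", "b")]
```

In the second system the word `ab` rewrites to both `a` and `b`, which are irreducible and distinct, so the pair `(a, b)` is not confluent.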
Since we will investigate the complexity of algorithms that take a TRS as
input, we have to define the length $|\mathcal{R}|$ of the TRS $\mathcal{R}$. If $I \ne (\Sigma \times \Sigma) \setminus Id_\Sigma$ then in
general the best possible coding of a rule from $\mathcal{R}$ is to simply write down words
over $\Sigma$ that represent the left- and right-hand side of the rule. Thus in this case
we define $|\mathcal{R}| = \sum \{(|\ell| + |r|) \mid (\ell, r) \in \mathcal{R}\}$. But if $I = (\Sigma \times \Sigma) \setminus Id_\Sigma$, i.e., if $\mathcal{R}$ is
a VRS we can code $\mathcal{R}$ more efficiently by using the binary notation. Therefore
in this case we define $|\mathcal{R}| = \sum \{bit(\ell) + bit(r) \mid (\ell, r) \in \mathcal{R}\}$. In this paper we
always assume that a TRS $\mathcal{R}$ is represented as a string of length $\Omega(|\mathcal{R}|)$.

3 Semi-Thue systems 

For terminating STSs confluence is decidable [BO81]. This classical result is
based on the critical pairs of a STS. Let $\mathcal{R}$ be a STS over $\Sigma$. The set of
critical pairs CP($\mathcal{R}$) contains exactly all pairs of the form (i) $(s r_1 t, r_2)$ where
$(\ell, r_1), (s \ell t, r_2) \in \mathcal{R}$ and (ii) $(r_1 u, s r_2)$ where $(st, r_1), (tu, r_2) \in \mathcal{R}$ and $t \ne 1$.
Note that CP($\mathcal{R}$) is finite. It is well known that $\mathcal{R}$ is locally confluent iff all critical
pairs are confluent [NB72], which can be decided in the terminating case.
For length-reducing STSs, confluence can even be decided in polynomial time
[BO81, KKMN85]. In this section we prove that COLR($\{a, b\}^*$) is moreover
P-complete. Under reasonable assumptions from complexity theory this roughly
means that the problem COLR($\{a, b\}^*$) is inherently sequential.
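The two kinds of critical pairs can be enumerated by scanning for factor occurrences and for suffix-prefix overlaps of left-hand sides. The sketch below follows that description; the function name is an assumption, and trivially confluent pairs such as $(r, r)$, which arise from a rule overlapping with itself completely, are not filtered out.

```python
def critical_pairs(rules):
    """Critical pairs of a semi-Thue system given as (lhs, rhs) string pairs."""
    pairs = set()
    for l1, r1 in rules:
        for l2, r2 in rules:
            # (i) l1 occurs as a factor of l2 = s l1 t: pair (s r1 t, r2).
            start = l2.find(l1)
            while start != -1:
                s, t = l2[:start], l2[start + len(l1):]
                pairs.add((s + r1 + t, r2))
                start = l2.find(l1, start + 1)
            # (ii) l1 = s t and l2 = t u with t non-empty: pair (r1 u, s r2).
            for k in range(1, min(len(l1), len(l2)) + 1):
                if l1[-k:] == l2[:k]:
                    s, u = l1[:-k], l2[k:]
                    pairs.add((r1 + u, s + r2))
    return pairs


RULES = [("ab", "a"), ("ba", "b")]
PAIRS = critical_pairs(RULES)
```

For these two rules the non-trivial pairs come from the overlaps `aba` and `bab`, giving `(aa, ab)` and `(bb, ba)`. For a terminating system, local confluence then amounts to checking each pair for a common descendant.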

Theorem 1. COLR($\{a, b\}^*$) is P-complete.

Proof. The following problem is known to be P-complete [GHR95]: Given a
deterministic Turing-machine $\mathcal{M}$, an input $w$ for $\mathcal{M}$, and a word $t \in \{\#\}^*$, does
$\mathcal{M}$ halt on $w$ after $\le |t|$ steps? Let $\mathcal{M} = (Q, \Sigma, \Box, \delta, q_0, q_f)$ be a deterministic
Turing-machine, $w \in (\Sigma \setminus \{\Box\})^*$ be an input for $\mathcal{M}$, and $m \ge 0$. Here $Q$ is the
set of states, $\Sigma$ is the tape alphabet ($Q \cap \Sigma = \emptyset$), $\Box \in \Sigma$ is the blank symbol,
$\delta : (Q \setminus \{q_f\}) \times \Sigma \to Q \times \Sigma \times \{L, R\}$ is the total transition function, $q_0 \in Q$ is
the initial state, and $q_f \in Q$ is the final state. $\mathcal{M}$ halts iff it reaches the final
state $q_f$. Let $\Sigma' = \{a' \mid a \in \Sigma\}$ be a disjoint copy of $\Sigma$ with $\Sigma' \cap Q = \emptyset$.
Let $\triangleright$ (left-end marker), $\triangleleft$ (right-end marker), $A$, and $B$ be additional symbols
and let $n = 3(m + 1) + |w| + 2$. We define the length-reducing STS $\mathcal{R}$ over
$\Gamma = Q \cup \Sigma \cup \Sigma' \cup \{\triangleright, \triangleleft, A, B\}$ by the following rules, where $a, b, c \in \Sigma$, $p, q \in Q$,
$q \ne q_f$.



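(1a)  $q_f x \to q_f$ for all $x \in \Gamma$
(1b)  $x q_f \to q_f$ for all $x \in \Gamma$
(2a)  $A^n B \to \triangleright q_0^{3(m+1)} w \triangleleft$
(2b)  $AB \to q_f$
(3a)  $\alpha q^{3i} \triangleleft \to \alpha b' p^{3(i-1)} \triangleleft$    if $\delta(q, \Box) = (p, b, R)$, $1 \le i \le m + 1$, $\alpha \in \Sigma' \cup \{\triangleright\}$
(3b)  $\alpha q^{3i} a \to \alpha b' p^{3(i-1)}$    if $\delta(q, a) = (p, b, R)$, $1 \le i \le m + 1$, $\alpha \in \Sigma' \cup \{\triangleright\}$
(3c)  $a' q^{3i} \triangleleft \to p^{3(i-1)} a b \triangleleft$    if $\delta(q, \Box) = (p, b, L)$, $1 \le i \le m + 1$
(3d)  $\triangleright q^{3i} \triangleleft \to \triangleright p^{3(i-1)} \Box b \triangleleft$    if $\delta(q, \Box) = (p, b, L)$, $1 \le i \le m + 1$
(3e)  $c' q^{3i} a \to p^{3(i-1)} c b$    if $\delta(q, a) = (p, b, L)$, $1 \le i \le m + 1$
(3f)  $\triangleright q^{3i} a \to \triangleright p^{3(i-1)} \Box b$    if $\delta(q, a) = (p, b, L)$, $1 \le i \le m + 1$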
The rules (1a) and (1b) make $q_f$ absorbing. The rules (3a) to (3f) simulate the
machine $\mathcal{M}$. Note that the state symbol is represented $3i$ times on the left-hand
side and $3(i - 1)$ times on the right-hand side in order to make these
rules length-reducing. Furthermore since $\mathcal{M}$ is deterministic, these rules do not
generate critical pairs. The rule (2a) generates an initial configuration for $\mathcal{M}$.
Since the initial state $q_0$ is represented $3(m+1)$ times in the initial configuration,
at most $m + 1$ steps of $\mathcal{M}$ will be simulated with the rules (3a) to (3f). Note that
$\mathcal{R}$ can be computed from $\mathcal{M}$, $w$, and $\#^m$ in deterministic logarithmic space. For
this it is necessary that $m$ is given in the unary representation since $|\mathcal{R}|$
increases exponentially with $bit(m)$. We claim that $\mathcal{R}$ is confluent iff $\mathcal{M}$ halts on
$w$ after $\le m$ steps.

If $\mathcal{M}$ does not halt on $w$ after $\le m$ steps then by simulating $m + 1$ steps of $\mathcal{M}$
we obtain $A^n B \to_{(2a)} \triangleright q_0^{3(m+1)} w \triangleleft \to_\mathcal{R}^* \triangleright u' v \triangleleft \in$ IRR($\mathcal{R}$) for some $u, v \in \Sigma^*$.
Since also $A^n B \to_{(2b)} A^{n-1} q_f \to_{(1b)}^* q_f \in$ IRR($\mathcal{R}$), $\mathcal{R}$ is not confluent. If $\mathcal{M}$ halts
on $w$ after $\le m$ steps then $A^n B \to_{(2a)} \triangleright q_0^{3(m+1)} w \triangleleft \to_\mathcal{R}^* \triangleright u' q_f^{3j} v \triangleleft \to_\mathcal{R}^* q_f$ for
some $j \ge 1$, $u, v \in \Sigma^*$. Hence the critical pair of the rules (2a) and (2b) is confluent.
In all other critical pairs one of the rules (1a) or (1b) is involved. Since $q_f$ is
absorbing these critical pairs are also confluent.

Now assume that $\Gamma = \{a_1, \dots, a_k\}$ and let $\mathcal{P} = \{(\phi(\ell), \phi(r)) \mid (\ell, r) \in \mathcal{R}\}$,
where the morphism $\phi : \Gamma^* \to \{a, b\}^*$ is defined by $\phi(a_i) = \dots$

Then (i) $\mathcal{P}$ is length-reducing and can be calculated from $\mathcal{R}$ in deterministic
logarithmic space and (ii) $\mathcal{P}$ is confluent iff $\mathcal{R}$ is confluent, see [Loh99] for a
proof of this fact. This proves the theorem. □

Using essentially the construction from the previous proof, the following result
for the uniform word problem for the class of confluent and length-reducing
STSs, which is known to be in P by [Boo82], can be proven, see [Loh99].

Theorem 2. The uniform word problem for the class of confluent and length-reducing
STSs over the alphabet $\{a, b\}$ is P-complete.

In contrast to the problem COLR($\{a, b\}^*$), which seems to be inherently sequential
by Theorem 1, for the more restricted problem COMO($\Sigma^*$) there exists an
efficient parallel algorithm, see [Loh99] for the proof of the following theorem.

Theorem 3. COMO($\Sigma^*$) is in NC$^2$ for every finite alphabet $\Sigma$.

This theorem follows from two facts: (i) The uniform word problem for $\varepsilon$-free
context free grammars can be solved in NC$^2$ [GHR95], p. 176, and (ii) the problem
whether a given pair of words is confluent with respect to a monadic STS can
be reduced to the word problem for an $\varepsilon$-free context free grammar [BJW82].

4 Vector replacement systems 

In [VRL98] it was shown that confluence is decidable but EXPSPACE-hard
for the class of all vector replacement systems. Based on critical pairs, more
feasible upper bounds can be obtained for the length-reducing case. Similarly
to STSs, VRSs also yield finite sets of critical pairs [BL81]. Let $\mathcal{R}$ be a VRS in
dimension $n$. The set $CP(\mathcal{R})$ of critical pairs of $\mathcal{R}$ contains exactly all pairs
$((s_1, \ldots, s_n), (t_1, \ldots, t_n))$ such that there exist rules
$(k_1, \ldots, k_n) \to (p_1, \ldots, p_n)$ and $(\ell_1, \ldots, \ell_n) \to (r_1, \ldots, r_n)$ in $\mathcal{R}$
and for all $i \in \{1, \ldots, n\}$ it holds
$s_i = \max\{k_i, \ell_i\} - k_i + p_i$ and $t_i = \max\{k_i, \ell_i\} - \ell_i + r_i$.
Then $\mathcal{R}$ is locally confluent iff all critical pairs are confluent. Note that
there are at most $|\mathcal{R}| \cdot (|\mathcal{R}| - 1)$ many critical pairs that are not trivially
confluent. For the length-reducing case, testing all critical pairs for
confluence leads to a straightforward PSPACE algorithm for deciding confluence.
In this section we will prove that confluence is moreover PSPACE-complete for
the class of all VRSs (without restriction on the dimension), i.e.,
$\bigcup_{k>0} \mathrm{COLR}(\mathbb{N}^k)$ is PSPACE-complete. Note that the calculation of a
normalform of a vector $n$ with respect to a length-reducing VRS may involve a
number of steps that is exponential in $bit(n)$. Therefore the calculation of
normalforms for the finitely many vectors that occur in the finitely many
critical pairs does not lead to a polynomial time algorithm (as it is the case
for STSs).
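The critical-pair formula above can be sketched in a few lines. This is only an illustration under stated assumptions: a VRS is represented as a list of `(lhs, rhs)` pairs of equal-length tuples of natural numbers, and the function name `critical_pairs` and the example rules are hypothetical, not taken from the paper.

```python
from itertools import permutations

def critical_pairs(rules):
    """Return the critical pairs of a VRS given as a list of (lhs, rhs) vector pairs.

    For two distinct rules k -> p and l -> r, the smallest vector to which both
    rules apply is the componentwise maximum of k and l. Applying each rule to
    that vector yields the two components of the critical pair.
    """
    pairs = []
    for (k, p), (l, r) in permutations(rules, 2):
        top = [max(ki, li) for ki, li in zip(k, l)]
        s = tuple(m - ki + pi for m, ki, pi in zip(top, k, p))
        t = tuple(m - li + ri for m, li, ri in zip(top, l, r))
        pairs.append((s, t))
    return pairs

# Hypothetical two-dimensional example with two length-reducing rules.
rules = [((1, 0), (0, 0)), ((1, 1), (0, 0))]
cps = critical_pairs(rules)
```

Iterating over ordered pairs of distinct rules produces exactly $|\mathcal{R}| \cdot (|\mathcal{R}| - 1)$ candidates, which matches the bound stated in the text. Deciding whether each pair is confluent is the expensive part and is not attempted here.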

Theorem 4. $\bigcup_{k>0} \mathrm{COLR}(\mathbb{N}^k)$ is PSPACE-complete.

Proof. The following problem is known to be PSPACE-complete [Kar72]: Given
a deterministic linear bounded automaton (briefly dlba) $M$ and an input $w$ for
$M$, does $M$ accept $w$? Let us fix a dlba $M = (Q, \Sigma, \triangleright, \triangleleft, \delta, q_0, q_f)$ and an
input $w \in (\Sigma \setminus \{\triangleright, \triangleleft\})^*$ for $M$, where $Q$ is the finite set of states,
$\Sigma$ is the tape alphabet, $\triangleright \in \Sigma$ is the left-end marker, $\triangleleft \in \Sigma$ is the
right-end marker, $\delta : (Q \setminus \{q_f\}) \times \Sigma \to Q \times \Sigma \times \{L, R\}$ is the transition
function, $q_0 \in Q$ is the initial state, and $q_f$ is the unique final state.
$M$ accepts an input iff it finally reaches the final state $q_f$.
The transition function must be defined such that (i) the read-write head never
moves to the left (right) of $\triangleright$ ($\triangleleft$), (ii) it does not overwrite $\triangleright$ ($\triangleleft$) by
a symbol different from $\triangleright$ ($\triangleleft$), and (iii) it does not overwrite a tape symbol
$a \in \Sigma \setminus \{\triangleright, \triangleleft\}$ by $\triangleright$ or $\triangleleft$. We identify each tape cell of $M$ with a number
from $\{0, \ldots, |w| + 1\}$, where cell 0 always contains $\triangleright$ and cell $|w| + 1$ always
contains $\triangleleft$. We assume that $M$ starts with the read-write head scanning cell 0
and that the read-write head is always in cell 0 if the final state $q_f$ is
reached.

We will construct a VRS $\mathcal{R}$ such that $\mathcal{R}$ is confluent iff $M$ accepts the input
$w$, which proves the theorem. Our construction is based on the simulation of a
dlba by a Petri net from [JLL77]. Let the alphabet $\Gamma$ be

$\Gamma = (\{0, \ldots, |w| + 1\} \times Q) \cup (\{0, \ldots, |w| + 1\} \times \Sigma) \cup \{A, \$\}$.

Note that we consider pairs $(i, q) \in \{0, \ldots, |w| + 1\} \times Q$ and pairs
$(i, a) \in \{0, \ldots, |w| + 1\} \times \Sigma$ as single symbols. The symbol $(i, q)$ means
that $M$ is in the state $q$ and the read-write head is scanning cell $i$, whereas
the symbol $(i, a)$ means that cell $i$ contains the tape symbol $a$. Since we
assume that $M$ terminates iff it reaches the final state $q_f$ and the read-write
head is in cell 0, the presence of the symbol $(0, q_f)$ indicates that $M$ has
terminated. Let $w = a_1 a_2 \cdots a_{|w|}$, $m = |Q| \cdot |\Sigma|^{|w|} \cdot (|w| + 2)$, and
$n = m + |w| + 4$. The $|\Gamma|$-dimensional VRS $\mathcal{R}$ consists of the following rules,
where $p, q \in Q$, $0 \leq i, j \leq |w| + 1$, and $a, b \in \Sigma$ (we use commutative words
over $\Gamma$ instead of vectors from $\mathbb{N}^{|\Gamma|}$ for the definition of $\mathcal{R}$ in order to
improve readability):




The rules (1a) and (1b) simulate $M$, where the additional $\$$ on the left-hand
side makes these rules length-reducing. Rule (2) makes $(0, q_f)$ absorbing.
Critical pairs that result from the rules (1a) and (1b) can be resolved with the
rules (3a) and (3b). In particular it is easy to see that the VRS that consists
of the rules (1a), (1b), (2), (3a) and (3b) is confluent. With the rules (4a) and
(4b) we intentionally create a critical pair. The first rule (4a) produces the
encoding of the initial configuration of $M$. Since each simulation step of $\mathcal{R}$
consumes a $\$$, we have to make enough $\$$ available for the initial
configuration. Since there are at most $m = |Q| \cdot |\Sigma|^{|w|} \cdot (|w| + 2)$
different configurations for $M$, the dlba $M$ either terminates after at most $m$
steps or loops forever. Thus $m$ many $\$$ suffice. Note that in the binary
representation of $\mathcal{R}(M, w)$ the $m$ many $\$$ are represented by
$O(\mathrm{ld}(m)) = O(\mathrm{ld}(|Q|) + |w| \cdot \mathrm{ld}(|\Sigma|) + \mathrm{ld}(|w| + 2))$ many bits, which is
polynomial in $|w|$ and the length of the description of $M$. The same holds for
the number $n = m + |w| + 4$ in the left-hand side of rule (4a), which is chosen
such that (4a) is length-reducing. The proof that $\mathcal{R}$ is confluent iff $M$
accepts $w$ is similar to the proof of Theorem 1, see [Loh99] for the details. □
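The size argument in this proof can be checked numerically. The following is a minimal sketch, assuming a hypothetical helper `step_budget` and made-up parameter values: it computes the configuration bound $m$ and the number of bits needed to write $m$ in binary, which grows only linearly in $|w|$ even though $m$ itself grows exponentially.

```python
def step_budget(num_states, alphabet_size, input_length):
    """Return (m, bits) for m = |Q| * |Sigma|^|w| * (|w| + 2).

    m bounds the number of configurations of the dlba; bits is the length of
    the binary encoding of m.
    """
    m = num_states * alphabet_size ** input_length * (input_length + 2)
    return m, m.bit_length()

# Hypothetical sizes: 3 states, 4 tape symbols, input of length 10.
m, bits = step_budget(num_states=3, alphabet_size=4, input_length=10)
```

Doubling `input_length` squares the alphabet factor of `m` but roughly doubles `bits`, which is why the binary encoding of the exponents keeps the reduction polynomial.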

Whether confluence is also PSPACE-complete for VRSs in a sufficiently large
but fixed dimension is left as an open question. Similarly to the semi-Thue
case, also for VRSs the complexity of the confluence problem decreases for the
monadic and special case, see [Loh99]:

Theorem 5. $\bigcup_{k>0} \mathrm{COMO}(\mathbb{N}^k)$ is in NP and $\bigcup_{k>0} \mathrm{COSP}(\mathbb{N}^k)$ is in P.

5 Special trace rewriting systems 

In [NO88] a trace monoid $M$ is presented such that $\mathrm{COLR}(M)$ is undecidable.
This result was sharpened in [Loh98], where it was shown that COLR(M) is 
decidable iff M is free or free commutative. These results imply that in general 
TRSs do not have finitely many critical pairs (in contrast to STSs and VRSs). 
Furthermore these results motivate the question whether there exist restricted 
but non-trivial classes of (length-reducing) TRSs for which confluence is decid- 
able. In particular, in [Die90], p 154, it was asked whether confluence is decidable 
for special TRSs. We answer this question positively in this section. 
Theorem 6. $\mathrm{COSP}(M)$ is in P for every trace monoid $M$.

Proof. For special VRSs the statement of the theorem is contained in Theorem
5. Thus let $M = M(\Sigma, I)$ be a trace monoid where $I \neq (\Sigma \times \Sigma) \setminus Id_\Sigma$ and let
$\mathcal{R}$ be a special TRS over $M$. Let NF be an algorithm that computes an arbitrary
normalform $NF(u, \mathcal{R})$ of a given input trace $u$ with respect to $\mathcal{R}$. Consider
the following algorithm that we call SPECIAL:



Input: A special TRS $\mathcal{R}$ over $M(\Sigma, I)$

forall $(p_1 s q_1, p_2 s q_2) \in dom(\mathcal{R}) \times dom(\mathcal{R})$ with $p_1\, I\, p_2$, $q_1\, I\, q_2$ do
    $nf_1 := NF(p_1 q_1, \mathcal{R})$; $nf_2 := NF(p_2 q_2, \mathcal{R})$;
    if $nf_1 \neq nf_2$ then return “$\mathcal{R}$ not confluent”   (*)
    else
        $u := nf_1$ $(= nf_2)$;
        forall $a \in \Sigma$ with $a\, I\, p_2 s q_1$ or $a\, I\, p_1 s q_2$ do
            $nf_1 := NF(au, \mathcal{R})$; $nf_2 := NF(ua, \mathcal{R})$;
            if $nf_1 \neq nf_2$ then return “$\mathcal{R}$ not confluent”   (**)
        endfor
    endif
endfor
return “$\mathcal{R}$ confluent”   (***)

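First we prove that $\mathcal{R}$ is not confluent if SPECIAL outputs “$\mathcal{R}$ not confluent”.
If SPECIAL executes line (*) then there exist $p_1 s q_1, p_2 s q_2 \in dom(\mathcal{R})$ with
$p_1\, I\, p_2$ and $q_1\, I\, q_2$. Furthermore there exists a normalform $u_i$ of $p_i q_i$
($i \in \{1, 2\}$) with $u_1 \neq u_2$. But then $\mathcal{R}$ is indeed not confluent since
$p_2 p_1 s q_1 q_2 \to_{\mathcal{R}} p_2 q_2 \to^*_{\mathcal{R}} u_2$ and
$p_2 p_1 s q_1 q_2 = p_1 p_2 s q_2 q_1 \to_{\mathcal{R}} p_1 q_1 \to^*_{\mathcal{R}} u_1$.
Now assume that SPECIAL executes line (**). Then $u_1 = u_2 = u$ but there exists
an $a \in \Sigma$ such that either $a\, I\, p_2 s q_1$ or $a\, I\, p_1 s q_2$ and there exist a
normalform $v_1$ of $au$ and a normalform $v_2$ of $ua$ such that $v_1 \neq v_2$. Assume
that $a\, I\, p_2 s q_1$. Then
$p_2 p_1 s q_1 a q_2 \to_{\mathcal{R}} p_2 a q_2 = a p_2 q_2 \to^*_{\mathcal{R}} au \to^*_{\mathcal{R}} v_1$ and
$p_2 p_1 s q_1 a q_2 = p_1 p_2 s q_1 a q_2 = p_1 a p_2 s q_2 q_1 \to_{\mathcal{R}} p_1 a q_1 = p_1 q_1 a \to^*_{\mathcal{R}} ua \to^*_{\mathcal{R}} v_2$.
Thus, again $\mathcal{R}$ is not confluent. The case that $a\, I\, p_1 s q_2$ can be dealt with
similarly by considering the trace $p_2 a p_1 s q_1 q_2$ instead of $p_2 p_1 s q_1 a q_2$.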

Now assume that SPECIAL outputs “$\mathcal{R}$ confluent” in line (***). By induction
on the length of traces it suffices to prove for all $t \in M$ that $\mathcal{R}$ is
confluent on $t$ if $\mathcal{R}$ is confluent on all $t'$ with $|t'| < |t|$. Thus, let
$t \in M$ and assume that $\mathcal{R}$ is confluent on all $t'$ with $|t'| < |t|$. We have to
prove that all pairs $(t_1, t_2)$ with $t \to^i_{\mathcal{R}} t_1$ and $t \to^j_{\mathcal{R}} t_2$ for some
$i, j \geq 0$ are confluent. The case $i = 0$ or $j = 0$ is trivial. Assume for a
moment that we have already considered all cases with $i = 1 = j$. Then we can
apply the arguments from the proof of Newman’s lemma:
$t \to_{\mathcal{R}} s_1 \to^*_{\mathcal{R}} t_1$ and $t \to_{\mathcal{R}} s_2 \to^*_{\mathcal{R}} t_2$ imply $s_i \to^*_{\mathcal{R}} s$
($i \in \{1, 2\}$) for some $s \in M$. Since $|s_1| < |t|$ and $s_1 \to^*_{\mathcal{R}} t_1$,
$s_1 \to^*_{\mathcal{R}} s$ it holds $t_1 \to^*_{\mathcal{R}} u$ and $s \to^*_{\mathcal{R}} u$ for some $u \in M$.
Since also $|s_2| < |t|$ and $s_2 \to^*_{\mathcal{R}} t_2$, $s_2 \to^*_{\mathcal{R}} s \to^*_{\mathcal{R}} u$ it holds
$t_2 \to^*_{\mathcal{R}} v$ and $u \to^*_{\mathcal{R}} v$, i.e., $t_1 \to^*_{\mathcal{R}} u \to^*_{\mathcal{R}} v$, for some
$v \in M$ and the pair $(t_1, t_2)$ is confluent.
Thus, it suffices to consider arbitrary factorizations $t = u_1 \ell_1 v_1 = u_2 \ell_2 v_2$,
where $\ell_1, \ell_2 \in dom(\mathcal{R})$, and to prove that the pair $(u_1 v_1, u_2 v_2)$ is
confluent. Lemma 1 applied to the identity $u_1 \ell_1 v_1 = u_2 \ell_2 v_2$ gives nine
traces $p_i, q_i, w_i, y_i$ ($i \in \{1, 2\}$) and $s$ such that (see also the diagram
below)



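- $\ell_1 = p_1 s q_1$, $\ell_2 = p_2 s q_2$,
- $u_1 = y_1 p_2 w_2$, $u_2 = y_1 p_1 w_1$, $v_1 = w_1 q_2 y_2$, $v_2 = w_2 q_1 y_2$,
- $t = y_1 p_1 w_1 p_2 s q_2 w_2 q_1 y_2 = y_1 p_2 w_2 p_1 s q_1 w_1 q_2 y_2$,
- $p_1\, I\, p_2$, $q_1\, I\, q_2$, $w_1\, I\, w_2$, $w_1\, I\, p_2 s q_1$, $w_2\, I\, p_1 s q_2$.

Diagram (each row spells $u_2$, $\ell_2$, $v_2$ from left to right; each column
spells $u_1$, $\ell_1$, $v_1$ from bottom to top):

    v_2 |  w_2   q_1   y_2
    l_2 |  p_2   s     q_2
    u_2 |  y_1   p_1   w_1
        +------------------
           u_1   l_1   v_1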



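We show that the pair $(y_1 p_1 w_1 w_2 q_1 y_2,\ y_1 p_2 w_2 w_1 q_2 y_2)$ is confluent. If
$y_1 \neq 1$ or $y_2 \neq 1$ then for $t' = p_2 w_2 p_1 s q_1 w_1 q_2$ it holds $|t'| < |t|$ and
$t' \to_{\mathcal{R}} p_2 w_2 w_1 q_2$, $t' = p_1 w_1 p_2 s q_2 w_2 q_1 \to_{\mathcal{R}} p_1 w_1 w_2 q_1$. Thus the
pair $(p_1 w_1 w_2 q_1,\ p_2 w_2 w_1 q_2)$ is confluent, which therefore also holds for
the pair $(y_1 p_1 w_1 w_2 q_1 y_2,\ y_1 p_2 w_2 w_1 q_2 y_2)$. Thus it suffices to show that the
pair $(p_1 w_1 w_2 q_1,\ p_2 w_2 w_1 q_2) = (w_2 p_1 q_1 w_1,\ w_1 p_2 q_2 w_2)$ is confluent. We have
one of the situations that are considered in the outer forall-loop of SPECIAL.
Since we assume that SPECIAL outputs “$\mathcal{R}$ confluent” we know that
$p_i q_i \to^*_{\mathcal{R}} u$ ($i \in \{1, 2\}$) for some $u \in M$. Hence
$w_2 p_1 q_1 w_1 \to^*_{\mathcal{R}} w_2 u w_1$, $w_1 p_2 q_2 w_2 \to^*_{\mathcal{R}} w_1 u w_2$ and it suffices to
prove that the pair $(w_2 u w_1,\ w_1 u w_2)$ is confluent. The case $w_1 = 1 = w_2$ is
trivial. Thus, assume w.l.o.g. $w_1 = wa$, where $a \in \Sigma$. Since $w_1\, I\, p_2 s q_1$ it
follows $a\, I\, p_2 s q_1$. Thus, $a \in \Sigma$ is one of the symbols that are considered
in the inner forall-loop of SPECIAL. It follows $au \to^*_{\mathcal{R}} v$ and
$ua \to^*_{\mathcal{R}} v$ for some $v \in M$. Thus

$w u a w_2 \to^*_{\mathcal{R}} w v w_2$ and $w_1 u w_2 = w a u w_2 \to^*_{\mathcal{R}} w v w_2$.   (1)

Next let us consider $t' = p_1 w p_2 s q_2 w_2 q_1 = p_1 w \ell_2 w_2 q_1$ ($t'$ results from $t$
by replacing the factor $w_1 = wa$ by $w$). It holds $|t'| < |t|$ and since $w$
satisfies the same independencies as $w_1$ it holds
$t' = p_2 w_2 p_1 s q_1 w q_2 = p_2 w_2 \ell_1 w q_2$. Thus
$t' \to_{\mathcal{R}} p_1 w w_2 q_1 = w_2 p_1 q_1 w \to^*_{\mathcal{R}} w_2 u w$ and
$t' \to_{\mathcal{R}} p_2 w_2 w q_2 = w p_2 q_2 w_2 \to^*_{\mathcal{R}} w u w_2$.

Hence $w_2 u w \to^*_{\mathcal{R}} x$, $w u w_2 \to^*_{\mathcal{R}} x$ for some $x \in M$ and

$w_2 u w_1 = w_2 u w a \to^*_{\mathcal{R}} xa$ and $w u a w_2 = w u w_2 a \to^*_{\mathcal{R}} xa$.   (2)

Finally, since $w u a w_2 \to^*_{\mathcal{R}} w v w_2$ by (1) and $w u a w_2 \to^*_{\mathcal{R}} xa$ by (2) and
$|w u a w_2| = |w_1 u w_2| \leq |w_1 p_1 q_1 w_2| < |p_1 w_1 p_2 s q_2 w_2 q_1| = |t|$ (where the strict
inequality follows from $\ell_2 = p_2 s q_2 \neq 1$) it holds $w v w_2 \to^*_{\mathcal{R}} z$ and
$xa \to^*_{\mathcal{R}} z$ for some $z \in M$. But then $w_1 u w_2 \to^*_{\mathcal{R}} w v w_2 \to^*_{\mathcal{R}} z$ by (1)
and $w_2 u w_1 \to^*_{\mathcal{R}} xa \to^*_{\mathcal{R}} z$ by (2). Thus the pair $(w_1 u w_2,\ w_2 u w_1)$ is
confluent and the correctness of SPECIAL is proved.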

Finally we have to show that SPECIAL runs in polynomial time. This follows
from the following two facts: (i) For a fixed independence alphabet $(\Sigma, I)$,
the number of different factorizations $\ell = psq$ of a trace $\ell$ is bounded by a
polynomial in $|\ell|$. This follows from the fact that the number of prefixes of
a trace $t$ is bounded by a polynomial in $|t|$ [BMS89]. (ii) A normalform of a
trace $t$ with respect to a length-reducing TRS $\mathcal{R}$, which is not a VRS, can be
calculated in time bounded by a polynomial in $|t|$ and $\|\mathcal{R}\|$ [Die90] (in the
algorithms in [Die90] the TRS $\mathcal{R}$ is not part of the input, but it is easy to
see that they also run in polynomial time in the uniform case, where the TRS is
part of the input). □

We should mention that we proved a slight generalization of Theorem 6 in the
long version [Loh99] of this paper. One might ask whether confluence can be
decided also for arbitrary monadic TRSs. We leave this as an open question.
We close this section with a simple problem that is decidable for monadic
STSs but in general undecidable for special TRSs. A trace $u \in M(\Sigma, I)$ is said
to be connected if there does not exist a factorization $u = vw$ with
$v \neq 1 \neq w$ and $v\, I\, w$. A set $L \subseteq M(\Sigma, I)$ is connected if every $u \in L$ is
connected. A set $L \subseteq M(\Sigma, I)$ is recognizable if the set
$\{s \in \Sigma^* \mid \exists u \in L : u = [s]_I\}$ of all words that represent a trace in $L$
is a regular word language. This is just one of several possibilities of
defining recognizable trace languages, see e.g. chapter 6 of [DR95]. A
fundamental result of Ochmanski [Och85] states that the class of all
recognizable subsets of $M(\Sigma, I)$ is the smallest class $\mathcal{C}$ that contains all
finite subsets of $M(\Sigma, I)$ and that is closed under (i) union, (ii)
concatenation of two sets (where the concatenation of $L_1$ and $L_2$ is
$L_1 L_2 = \{u_1 u_2 \mid u_1 \in L_1, u_2 \in L_2\}$) and (iii) the star-operator restricted to
connected sets, i.e., if $L$ belongs to $\mathcal{C}$ and is connected then also
$L^* = \{u_1 u_2 \cdots u_n \mid n \geq 0,\ u_1, u_2, \ldots, u_n \in L\}$ belongs to $\mathcal{C}$. It is known
that for two recognizable word languages $L_1, L_2 \subseteq \Sigma^*$ and a confluent and
monadic STS $\mathcal{R}$ it can be decided whether there exist $u_1 \in L_1$, $u_2 \in L_2$ with
$u_1 \leftrightarrow^*_{\mathcal{R}} u_2$ [BO85]. This decision problem is known as the extended
word problem. For trace monoids the situation is quite different as the
following theorem shows.

Theorem 7. There exists a trace monoid $M = M(\Sigma, I)$, a special TRS $\mathcal{R}$ over
$M$ of the form $\mathcal{R} = \{a \to 1\}$, where $a \in \Sigma$, and a recognizable language
$L_1 \subseteq M$ such that the following problem is undecidable: Given a recognizable
language $L_2 \subseteq M$, do there exist $u_1 \in L_1$ and $u_2 \in L_2$ such that
$u_1 \leftrightarrow^*_{\mathcal{R}} u_2$?

Proof. It is well-known that the Post Correspondence Problem, briefly PCP,
is undecidable over the alphabet $\{a, b\}$. Let $P = \{(s_1, t_1), \ldots, (s_n, t_n)\}$ be an
instance of the PCP, where $s_i, t_i \in \{a, b\}^*$. Let $\{\bar{a}, \bar{b}\}$ be a copy of
$\{a, b\}$ and let $\# \notin \{a, b, \bar{a}, \bar{b}\}$. Let $\Sigma = \{a, b, \bar{a}, \bar{b}, \#\}$ and define an
independence relation $I$ on $\Sigma$ by
$I = \{a, b\} \times \{\bar{a}, \bar{b}\} \cup \{\bar{a}, \bar{b}\} \times \{a, b\}$. Note that $\#$ is dependent on
every symbol. For a word $s \in \{a, b\}^*$ the word $\bar{s}$ is defined in the obvious
way. Let $L_1 = \{[a \# \bar{a}]_I, [b \# \bar{b}]_I\}^*$ and
$L_2 = \{[s_i \# \bar{t}_i]_I \mid 1 \leq i \leq n\}^+$. By Ochmanski’s theorem $L_1$ and $L_2$ are
recognizable. Let $\mathcal{R} = \{\# \to 1\}$, which is confluent. Now it is easy to see
that the PCP $P$ has a solution iff there exist $u_1 \in L_1$, $u_2 \in L_2$, and
$v \in IRR(\mathcal{R})$ with $u_1 \to^*_{\mathcal{R}} v$, $u_2 \to^*_{\mathcal{R}} v$. Since $\mathcal{R}$ is confluent,
the last property holds iff $u_1 \leftrightarrow^*_{\mathcal{R}} u_2$. □
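The reduction in this proof can be tried out on a small instance. The sketch below is illustrative only: barred letters are encoded as upper-case letters, the PCP instance is made up, and trace equality is decided with the standard projection criterion (two words represent the same trace iff their projections onto every pair of dependent letters agree), which is a swapped-in technique rather than something stated in the text. All function names are hypothetical.

```python
from itertools import combinations_with_replacement

def trace_equal(u, v, independence):
    """Decide whether the words u and v represent the same trace.

    `independence` is a set of ordered letter pairs, assumed to be symmetric.
    """
    alphabet = sorted(set(u) | set(v))
    for x, y in combinations_with_replacement(alphabet, 2):
        if (x, y) in independence:
            continue  # independent letters may be reordered freely
        proj_u = [c for c in u if c in (x, y)]
        proj_v = [c for c in v if c in (x, y)]
        if proj_u != proj_v:
            return False
    return True

def erase_hash(word):
    """Normal form with respect to the rule # -> 1: delete every #."""
    return word.replace("#", "")

def bar(word):
    """Encode the barred copy of a word by upper-casing it (an assumption)."""
    return word.upper()

# Independence relation: plain letters commute with barred letters.
I = {(x, y) for x in "ab" for y in "AB"} | {(y, x) for x in "ab" for y in "AB"}

# Hypothetical PCP instance with solution 1, 2: ab.b = a.bb = abb.
pcp = [("ab", "a"), ("b", "bb")]
u1 = "".join(c + "#" + bar(c) for c in "abb")      # a word representing an element of L1
u2 = "".join(s + "#" + bar(t) for s, t in pcp)     # a word representing an element of L2
same = trace_equal(erase_hash(u1), erase_hash(u2), I)
```

Here `same` is true because both normal forms project to `abb` on the plain letters and to `ABB` on the barred ones. The undecidability result concerns the search over all of $L_1$ and $L_2$; checking one candidate pair, as done here, is easy.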
Abstract. This paper studies the structural complexity of model checking
for (variations on) the specification formalisms used in the tools CMC
and Uppaal, and fragments of a timed alternation-free $\mu$-calculus. For
each of the logics we study, we characterize the computational complexity
of model checking, as well as its specification and program complexity,
using timed automata as our system model.



1 Introduction 

The extension of the model checking paradigm to the specification and verifica- 
tion of real-time systems has been thoroughly studied in the last few years. This 
extensive research effort has led to the development of specification logics that 
extend standard untimed formalisms with the quantitative analysis of timing 
constraints (see, e.g., [4, 15, 18]), and to important theoretical results setting the 
limits of decidability for model checking. This theory is now embodied in veri- 
fication tools like HyTech [23], Kronos [24] and Uppaal [20], which have been 
successfully applied to the verification of real sized systems. 

The successful application of the aforementioned verification tools to the 
analysis of realistic systems indicates that automatic verification of real-time, 
embedded software may be feasible in practice. However, despite many impor- 
tant theoretical results presented in op. cit., the literature is lacking a compre- 
hensive analysis of the structural complexity of model checking for real-time 
logics. In the untimed case, model checking algorithms with a polynomial time 
complexity, and often small space requirements, have been developed for several 
branching time temporal logics [8, 9]. In the timed case, most of the model
checking problems considered in the literature are PSPACE-hard [3, 10, 15]. Clearly
the quantitative analysis of timing constraints increases the complexity of model
checking, but it is interesting to analyze precisely in which cases this complexity 
blow-up occurs. In the untimed case, several papers (see, e.g., [13, 22, 11]) study 

* * * Basic Research in Computer Science. 



M. Kutyłowski, L. Pacholski, T. Wierzbicki (Eds.): MFCS’99, LNCS 1672, pp. 125-136, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999




126 Luca Aceto and François Laroussinie



in detail the effect of the temporal operators, the number of atomic propositions 
or the depth of operators’ nesting in the complexity of model checking, giving 
a better understanding of the complexity issue. Here, among other things, we 
address the same kind of problem for the timed case: what happens if time is 
inserted either only in the model or only in the formula? And what happens if 
we use less expressive logics with restricted operators? 

We consider several timed modal logics: $L_\nu$ has been introduced in [18], and
is the specification language used in the tool CMC [17]; $L_s$ is a fragment of
$L_\nu$ which has been proposed in [19] in order to improve the efficiency of model
checking in practice; SBLL [2] and $L_{\forall S}$ [1] have been introduced for their
properties w.r.t. the testing timed automaton method that is currently used in
verification tools like Uppaal to check for properties other than plain
reachability ones.

For each of these property languages, we study the computational complexity 
of model checking, using timed automata [5] as our system model. As argued by 
Lichtenstein and Pnueli [21], the complexity of the model checking problem can 
be measured in three different ways. First, one can fix the specification and mea- 
sure the complexity as a function of the size of the program being verified (the 
program complexity measure). Secondly, one can fix the program and measure 
the complexity as a function of the size of the specification (the specification 
complexity measure). Finally, the combined complexity of the model checking 
problem is measured as a function of the size of both the program and the spec- 
ification. In this paper we offer complexity results for these three different views 
of the model checking problem for the logics we consider. In so doing, we give an 
a posteriori justification, couched in complexity-theoretic arguments, for some 
of the folk beliefs in the area of model checking for real-time systems, and for 
some of the choices made by developers of real-time verification tools. 

Outline of the Main Results. We begin by analyzing the complexity of model
checking for $L_{\mu,\nu}$, a timed alternation-free modal $\mu$-calculus (AFMC). In the
untimed setting, such a fragment of the modal $\mu$-calculus plays an important
role as a specification formalism because it is fairly expressive and its restricted
syntax makes the symbolic evaluation of expressions very simple (more precisely,
linear both in the size of the model and the specification). In the real-time setting,
we show that the complexity of model checking for the timed AFMC, and for its
sublogic $L_\nu$, is EXPTIME-complete, as are both the program complexity and the
specification complexity. (Perhaps surprisingly, the model checking problem for
$L_\nu$, and a fortiori for the timed AFMC, is EXPTIME-complete even if we fix
the model to be the inactive process without clocks, nil.) We also prove that the
model checking problem for $L_\nu$ without greatest fixpoints (essentially, a timed
version of Hennessy-Milner logic [14]) is PSPACE-complete.

It is instructive to compare the above results with similar ones for the
untimed alternation-free $\mu$-calculus. As previously mentioned, for such a program
logic, we have algorithms for model checking that run in time linear both in the
size of the program and of the specification. Moreover, both the program and
the specification complexities are P-complete [6, 12]. Note, however, that the
program complexity of the alternation-free $\mu$-calculus for concurrent programs




Is your Model Checker on Time? 127 





                              Model checking     Prog compl.        Spec compl.
$L_\nu$                       EXPTIME-complete   EXPTIME-complete   EXPTIME-complete
$L_s$, SBLL, $L_{\forall S}$  PSPACE-complete    PSPACE-complete    PSPACE-complete
$L_\nu^-$                     PSPACE-complete    P                  PSPACE-complete
$L_s^-$                       coNP-complete      P                  coNP-complete
SBLL$^-$                      PSPACE-complete    PSPACE-complete    coNP-complete



Table 1. Overview of the Results 



is EXPTIME-complete [6], and this matches exactly the complexity results we 
offer for model checking. It is also interesting to note that the complexity 
of CTL model checking and reachability for concurrent programs is PSPACE- 
complete [6], matching the complexity of model checking for TCTL [4] and of 
reachability in timed automata, respectively. These results seem to provide a
mathematical grounding to the folk belief that “clocks act like concurrent
programs”, and that increasing the number of clocks corresponds to adding parallel
components.

We then proceed to develop a thorough analysis of the complexity of model
checking for all the other timed modal property languages that we have found
in the literature. In each case, we offer results pinpointing the program, the
specification as well as the combined complexity of model checking for the
property languages with and without fixpoints. An overview of the results we have
obtained is presented in Table 1, where $L^-$ denotes the fixpoint free fragment
of $L$. Here we just wish to point out that the model checking problem for the
property language $L_s$ is PSPACE-complete, no matter whether the complexity is
measured with respect to the size of the program, of the specification or of both.
In light of the aforementioned results, and assuming that PSPACE is different
from EXPTIME, the model checking problem for $L_s$ has a lower computational
complexity than that for $L_\nu$. Our results thus offer a complexity-theoretic
justification for the claims in [19]. The source of the lower complexity derives from
the observation that the model checking problem for $L_s$, unlike that for $L_\nu$, can
be reduced in polynomial time to reachability checking in timed automata, a
problem whose PSPACE-completeness was shown in [5].

2 Basic definitions 

We begin by briefly reviewing a variation on the timed automaton model pro- 
posed by Alur and Dill [5] and the property languages that will be used in this 
study. 

Timed Automata. Let Act be a finite set of actions, and let $\mathbb{N}$ and $\mathbb{R}_{\geq 0}$ denote
the sets of natural and non-negative real numbers, respectively. We write $\mathcal{D}$ for
the set of delay actions $\{\epsilon(d) \mid d \in \mathbb{R}_{\geq 0}\}$.

Let $C$ be a set of clocks. We use $B(C)$ to denote the set of boolean expressions
over atomic formulae of the form $x \sim p$ and $x - y \sim p$, with $x, y \in C$, $p \in \mathbb{N}$,
and ${\sim} \in \{<, >, =\}$. Moreover we write $B_k(C)$ for the restriction of $B(C)$ to
expressions where the integer constants belong to $\{0, \ldots, k\}$. Expressions in
$B(C)$ are interpreted over the collection of time assignments. A time assignment,
or valuation, $v$ for $C$ is a function from $C$ to $\mathbb{R}_{\geq 0}$. We write $\mathbb{R}_{\geq 0}^C$ for the
collection of valuations for $C$. Given $g \in B(C)$ and a time assignment $v$, the
boolean value $g(v)$ describes whether $g$ is satisfied by $v$ or not. For every time
assignment $v$ and $d \in \mathbb{R}_{\geq 0}$, we use $v + d$ to denote the time assignment which
maps each clock $x \in C$ to the value $v(x) + d$. For every $C' \subseteq C$, $[C' \to 0]v$
denotes the assignment for $C$ which maps each clock in $C'$ to the value 0 and
agrees with $v$ over $C \setminus C'$.

Definition 1. A timed automaton (TA) is a quintuple $A = (Act, N, n_0, C, E)$
where $N$ is a finite set of nodes, $n_0$ is the initial node, $C$ is a finite set of clocks,
and $E \subseteq N \times B(C) \times Act \times 2^C \times N$ is a set of edges. The tuple
$e = (n, g, a, r, n') \in E$ stands for an edge from node $n$ to node $n'$ with action $a$,
where $r$ denotes the set of clocks to be reset to 0 and $g$ is the enabling condition
(or guard). We use $MCst(A)$ to denote the largest integer constant occurring in
the guards of $A$.

A state (or configuration) of a timed automaton $A$ is a pair $(n, v)$ where $n$ is a
node of $A$ and $v$ is a time assignment for $C$. The initial state of $A$ is
$(n_0, [C \to 0])$ where $n_0$ is the initial node of $A$, and $[C \to 0]$ is the time
assignment mapping all clocks in $C$ to 0. The operational semantics of a timed
automaton $A$ is given by the Timed Labelled Transition System (TLTS)
$T_A = (S_A, Act \cup \mathcal{D}, s^0_A, \to)$, where $S_A$ is the set of states of $A$, $s^0_A$ is
the initial state of $A$, and $\to$ is the transition relation defined as follows:

$(n, v) \xrightarrow{a} (n', v')$ iff $\exists (n, g, a, r, n') \in E.\ g(v) = tt \wedge v' = [r \to 0]v$

$(n, v) \xrightarrow{\epsilon(d)} (n', v')$ iff $n = n'$ and $v' = v + d$
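The two transition rules can be mirrored directly in code. The following is a minimal sketch, not an implementation of any tool mentioned in the paper: the `Edge` class, the function names and the one-edge example automaton are assumptions made for illustration, and guards are modelled as arbitrary Python predicates instead of expressions in $B(C)$.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

Valuation = Dict[str, float]

@dataclass(frozen=True)
class Edge:
    source: str
    guard: Callable[[Valuation], bool]   # enabling condition g
    action: str                          # action label a
    resets: FrozenSet[str]               # clocks r reset to 0
    target: str

def delay(state, d):
    """Delay transition: stay in the node and advance every clock by d >= 0."""
    node, v = state
    if d < 0:
        raise ValueError("delays must be non-negative")
    return node, {x: t + d for x, t in v.items()}

def action_successors(edges, state, action):
    """Action transitions: follow each enabled edge and reset its clocks to 0."""
    node, v = state
    result = []
    for e in edges:
        if e.source == node and e.action == action and e.guard(v):
            new_v = {x: (0.0 if x in e.resets else t) for x, t in v.items()}
            result.append((e.target, new_v))
    return result

# Hypothetical automaton: from n0, action a is enabled once x >= 1 and resets x.
edges = [Edge("n0", lambda v: v["x"] >= 1, "a", frozenset({"x"}), "n1")]
start = ("n0", {"x": 0.0, "y": 0.0})
too_early = action_successors(edges, start, "a")
later = action_successors(edges, delay(start, 1.5), "a")
```

In the initial state the guard fails, so `too_early` is empty; after a delay of 1.5 the edge fires, resetting `x` while `y` keeps its value.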

Remark 1. Note that we could consider extended TAs where we assign an invari- 
ant (i.e. a downward closed clock constraint) to each node to avoid excessive time 
delays. All the results presented here will still hold for extended TAs. Note that, 
given a complexity class C, having a C-hardness result for (simple) TAs implies 
the same for extended TAs, while having a C membership result for extended 
TAs implies the same for TAs. 

The specification languages. We now define a timed alternation-free modal
$\mu$-calculus.

Definition 2. Let $K$ be a finite set of clocks, Id a set of identifiers. The set
$L_{\mu,\nu}$ of formulae over $K$ and Id is generated by the following grammar:

$\varphi ::= g \mid \varphi \wedge \psi \mid \varphi \vee \psi \mid \langle a \rangle \varphi \mid [a] \varphi \mid \exists \varphi \mid \forall \varphi \mid K' \text{ in } \varphi \mid \max(X, \varphi) \mid \min(X, \varphi) \mid X$

where $a \in Act$, $g \in B(K)$, $K' \subseteq K$ and $X \in Id$. Moreover, each occurrence
of an identifier $X$ in a formula has to be bound by a $\min(X, \varphi)$ (or $\max(X, \varphi)$)
operator, and it cannot occur in a $\varphi$-subformula of the form $\max(X', \psi)$ (resp.
$\min(X', \psi)$). (This restriction corresponds to the “alternation-free” property.)
$(n, v, u) \models [a] \varphi$          iff $\forall (n', v').\ (n, v) \xrightarrow{a} (n', v') \Rightarrow (n', v', u) \models \varphi$
$(n, v, u) \models \forall \varphi$      iff $\forall d \in \mathbb{R}_{\geq 0},\ (n, v + d, u + d) \models \varphi$
$(n, v, u) \models g$                    iff $g(u) = tt$
$(n, v, u) \models K' \text{ in } \varphi$   iff $(n, v, [K' \to 0]u) \models \varphi$
$(n, v, u) \models \max(X, \varphi)$     iff $(n, v, u)$ belongs to the largest solution of $X = \varphi$



Table 2. Semantics 



New operators like $tt$, $ff$, $g \Rightarrow \varphi$ (read ‘$g$ implies $\varphi$’) can be easily defined.
Let $MCst(\varphi)$ be the largest integer constant occurring in the clock constraints in
$\varphi$. Given a TA $A$, we interpret formulae in $L_{\mu,\nu}$ w.r.t. extended configurations
$(n, v, u)$, where $(n, v)$ is a configuration of $A$ and $u$ is a time assignment for $K$.
Whereas the classical modal operators $\langle a \rangle$ and $[a]$ deal with action transitions,
the operator $\exists$ (resp. $\forall$) denotes existential (resp. universal) quantification over
delay transitions. The clocks in $K$ are so-called formula clocks; they increase
synchronously with the automata clocks, and they are used as stopwatches for
measuring the time elapsing between states of the system. The formula $K' \text{ in } \varphi$
initializes the set of formula clocks $K'$ to 0 in $\varphi$. The constraints $g$ are used to
compare the value of formula clocks in the current extended configuration with
an integer value. Finally, an extended configuration satisfies $\max(Z, \varphi)$ (resp.
$\min(Z, \varphi)$) if it belongs to the largest (resp. least) solution of the equation
$Z = \varphi$ over the complete lattice of sets of extended configurations. The existence
of these solutions is guaranteed by standard fixpoint theory. The semantics of
$L_{\mu,\nu}$ is sketched in Table 2. (The operators $\langle a \rangle$ and $\exists$ are duals of $[a]$ and $\forall$;
the semantics of boolean operators is omitted.) The full formal details of the
semantics are standard [16].
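On a finite transition system, such as a region graph, the largest solution of $Z = \varphi$ can be computed by iterating downwards from the full state set. The sketch below works on an untimed, hand-made transition system and evaluates one fixed formula shape, $\max(X, P \wedge [a]X)$; the helper names and the example are assumptions for illustration and do not cover delay modalities or formula clocks.

```python
def box(transitions, action, target_set, states):
    """States all of whose `action`-successors lie in target_set (the [a] modality)."""
    return {s for s in states
            if all(t in target_set
                   for (src, act, t) in transitions
                   if src == s and act == action)}

def greatest_fixpoint(f, states):
    """Iterate a monotone function f downwards from the full state set."""
    current = set(states)
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt

# Hypothetical finite transition system with four states.
states = {0, 1, 2, 3}
transitions = {(0, "a", 1), (1, "a", 0), (2, "a", 3)}
safe = {0, 1, 2}  # states where an atomic property P holds

# max(X, P and [a]X): P holds in every state reachable by a-steps.
invariant = greatest_fixpoint(
    lambda X: safe & box(transitions, "a", X, states), states)
```

State 2 satisfies $P$ but is removed in the second iteration because its only successor, state 3, violates $P$. Since the state set is finite and the function is monotone, the iteration terminates after at most `len(states)` rounds.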

As an example of a property that can be expressed in $L_{\mu,\nu}$ using fixpoints
and clock constraints, consider the formula

$\max\bigl(X,\ [b]\bigl(\{x\} \text{ in } \exists(\langle c \rangle tt \wedge x < 3)\bigr) \wedge [a]X \wedge \forall X\bigr)$.

This formula expresses the fact that, in every state that is reachable by
performing $a$-actions and delays, every occurrence of a $b$-action can be followed by
a $c$-action within 3 time units.

Fragments of $L_{\mu,\nu}$. The logic $L_\nu$ [18] is the fragment of $L_{\mu,\nu}$ in which only
greatest fixpoints are allowed. The logic $L_s$ [19] is the fragment of $L_\nu$ without
the existential modalities $\langle a \rangle$ and $\exists$, and where only a restricted disjunction of
the form $g \vee \varphi$ (with $g \in B(K)$) is allowed.

The property languages SBLL and $L_{\forall S}$ extend $L_s$, and use a slightly different
kind of TAs where (1) $\mathcal{U}$ is a subset of Act s.t. any edge labeled with $a \in \mathcal{U}$ has the
guard $tt$ and (2) Act contains the label $\tau$ used for the internal action of automata.
Moreover they are based on different semantics (denoted by $\vdash$) compared with
$L_\nu$ and $L_s$: a formula $\varphi$ holds for $(n, v, u)$ only if $\varphi$ holds for every $(n', v', u)$
with $(n', v')$ reachable from $(n, v)$ in zero or more $\tau$-transitions. For example,
$(n, v, u) \vdash [a]\varphi$ iff for every $(n, v) \xrightarrow{\tau^*} (n', v')$ we have that
$(n', v') \xrightarrow{a} (n'', v'')$ implies $(n'', v'', u) \vdash \varphi$. Moreover $(n, v, u) \vdash \forall \varphi$ iff
for every $(n', v')$ reached from $(n, v)$ by using $\tau$-transitions and delay
transitions (of total duration $d$), we have $(n', v', u + d) \vdash \varphi$.

SBLL extends $L_s$ by allowing the use of $\langle a \rangle tt$ subformulae with $a \in \mathcal{U}$. $L_{\forall S}$
extends SBLL with new operators $\forall_S$ with $S \subseteq \mathcal{U}$. A formula $\forall_S \varphi$ holds for
$(n, v, u)$ iff $\varphi$ holds for any $(n', v', u + d)$ s.t. $(n', v')$ is reachable from $(n, v)$ by
using only $\tau$-transitions and delay transitions (with total duration $d$), but delay
transitions occur only in states in which none of the actions in $S$ are enabled.
These two languages can be translated into $L_\nu$ in the following sense: for any
$\varphi \in L_{\forall S}$, there exists an $L_\nu$ formula $\overline{\varphi}$ s.t. $A \vdash \varphi$ iff $A \models \overline{\varphi}$. For
example, we have that $\overline{[a]\varphi} = \max(X, [a]\overline{\varphi} \wedge [\tau]X)$. An important property
[2, 1] of SBLL and $L_{\forall S}$ is that their model checking problem can be reduced to a
reachability problem: for any formula $\varphi$ of these languages, we can build a
testing automaton $T_\varphi$ s.t. $A \vdash \varphi$ iff a reject node is not reachable in the
parallel composition $(A \mid T_\varphi)$. Moreover it has been shown that $L_{\forall S}$ is
expressive enough to encode any reachability property [1].



Verification of timed systems. Automatic verification of timed systems is
possible despite the uncountably infinite number of configurations associated with a
timed automaton. The decision procedure for $A \models \varphi$ is based on the well known
region technique [4]. Given $A$ and $\varphi$, it is possible to partition the infinite set
of time assignments over $C^+ = C \cup K$ into a finite number of regions in such a
way that two extended configurations $(n, u)$ and $(n, v)$, where $u, v \in \mathbb{R}_{\geq 0}^{C^+}$ are in
the same region, satisfy the same formulae. Formally the regions can be defined
as the equivalence classes induced by the equivalence relation over valuations
defined thus: given $u, v \in \mathbb{R}_{\geq 0}^{C^+}$, $u$ and $v$ are in the same region iff they satisfy
the same clock constraints in $B_M(C^+)$ where $M = \max(MCst(A), MCst(\varphi))$.
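The equivalence just defined can be tested directly for two concrete valuations. The sketch below is a naive illustration of the definition, not the algorithm used by any verification tool: it compares the truth values of every atomic constraint $x \sim p$ and $x - y \sim p$ with $p \in \{0, \ldots, M\}$, which suffices because all expressions in $B_M(C^+)$ are boolean combinations of these atoms. The function names and the sample valuations are hypothetical, and exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction
from itertools import permutations

def constraint_profile(valuation, max_const):
    """Truth values of all atoms x ~ p and x - y ~ p with p in {0, ..., max_const}."""
    clocks = sorted(valuation)
    terms = [valuation[x] for x in clocks]
    terms += [valuation[x] - valuation[y] for x, y in permutations(clocks, 2)]
    profile = []
    for value in terms:
        for p in range(max_const + 1):
            profile.append((value < p, value == p, value > p))
    return profile

def same_region(u, v, max_const):
    """Valuations over the same clocks are equivalent iff their profiles coincide."""
    return constraint_profile(u, max_const) == constraint_profile(v, max_const)

u = {"x": Fraction(3, 10), "y": Fraction(7, 10)}
v = {"x": Fraction(1, 5), "y": Fraction(9, 10)}
w = {"x": Fraction(7, 10), "y": Fraction(3, 10)}
```

With $M = 1$, `u` and `v` fall in the same region, whereas `w` does not: it satisfies $x - y > 0$ while `u` satisfies $x - y < 0$.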

We write $[u]$ for the region which contains the time assignment $u$, and use
$\mathcal{R}^{Cl}_k$ to denote the (finite) set of all regions for a set $Cl$ of clocks and the
maximum constant $k$. Given a region $[u]$ in $\mathcal{R}^{Cl}_k$ and $C' \subseteq Cl$, we define the
reset operator thus: $[C' \to 0][u] = [[C' \to 0]u]$. Moreover, given a region $\gamma$, its
successor region, denoted by $succ(\gamma)$, is the region $\gamma'$ s.t. for any $u \in \gamma$ there
exists $\delta \in \mathbb{R}_{\geq 0}$ with $[u + \delta] = \gamma'$, and $[u + \delta'] \in \{\gamma, \gamma'\}$ for every $\delta' < \delta$.
The region $succ(\gamma)$ is different from $\gamma$ iff $\gamma(x \leq k) = tt$ for some clock $x$.

Now, given a timed automaton $A = (Act, N, n_0, C, E)$, a set $K$ of formula
clocks and an integer constant $M$ with $M \geq MCst(A)$, we can define a
symbolic semantics [18] over the finite transition system $(S, \to)$, called region graph,
defined thus: $S = N \times \mathcal{R}^{C^+}_M$, and $\to$ consists of the action transitions
$\xrightarrow{a}$ ($a \in Act$) together with the delay transitions between a region and its
successor region. The symbolic semantics is closely related to the standard one:
for every formula $\varphi \in L_{\mu,\nu}$ whose clock constraints do not use constants greater
than $M$, and $u \in \gamma$, $(n, \gamma) \models \varphi$ iff $(n, u) \models \varphi$. Therefore each instance of the
timed model checking problem can be reduced to an untimed model checking query
over the region graph. Note that the size of $\mathcal{R}^{C^+}_M$ is in $O(|C^+|! \cdot M^{|C^+|})$.
Moreover for any region $\gamma$,
$|\{\gamma' \mid \gamma' = succ^i(\gamma),\ i \in \mathbb{N}\}| \leq 2 \cdot |C^+| \cdot (M + 1)$.

The reachability problem, which is a fundamental question in system verification,
is known to be PSPACE-complete [3, 10]. Moreover the model checking problem
for TCTL (a timed extension of CTL) is PSPACE-complete [4].

We shall use the abbreviations $A_{+t} \models \Phi_{+t}$, $A \models \Phi_{+t}$ and $A_{+t} \models \Phi$ to
denote, respectively, the model checking problems where clocks are allowed both
in automata and specifications, where clocks are allowed only in specifications
and where clocks are allowed only in automata.



3 Complexity results for model checking 

We now consider the complexity of model checking (MC) for the property
languages introduced previously. These results require defining the size of
a timed automaton $A = (Act, N, n_0, C, E)$ and of a formula $\varphi \in L_{\mu,\nu}$. The size
$|\varphi|$ of a formula is its length. We define $|A|$ as
$|N| + |C| + |E| + MCst(A) + \sum_{e \in E} |g_e|$ and assume a binary encoding for the
elements of the sets $N$ and $C$. Whether constants are represented in unary or
in binary does not change our results, except where explicitly mentioned.

Theorem 1. The complexity of $L_{\mu,\nu}$ and $L_\nu$ model checking is EXPTIME-complete.
Moreover, we have that the specification and program complexities of
$L_{\mu,\nu}$ and $L_\nu$ model checking are also EXPTIME-complete.

Proof. EXPTIME membership: We have seen that $A \models \varphi$ iff $\bar{A} \models \varphi$, where
$\bar{A}$ is an untimed automaton (the region graph) whose size is exponential in $|A|$
and over which $\varphi$ is interpreted as an untimed formula. If we slightly modify $\bar{A}$
by adding the transitive closure of the successor transitions, the size of the
resulting automaton is still exponential in $|A|$, and $\exists$ and $\forall$ become "one step"
modalities. Then $\varphi$ is a simple (untimed) alternation-free $\mu$-calculus formula
for which model checking is linear in $|\bar{A}|$ and $|\varphi|$ [9]. This gives the EXPTIME
membership for $L_{\mu,\nu}$ and $L_\nu$.

EXPTIME-hardness: Deciding whether a given linear bounded alternating
Turing machine (LBATM) $\mathcal{M}$ accepts a given input string $w$ is
EXPTIME-complete [7], and it can be reduced in polynomial time to a MC problem
$A_{\mathcal{M},w} \models \Phi$ with $\Phi \in L_\nu$. The main idea is that, due to the tape boundedness,
we can build a TA $A_{\mathcal{M},w}$ over actions $s$ and $accept$ s.t. any $s$-transition of
$A_{\mathcal{M},w}$ corresponds to a step of $\mathcal{M}$ (see [3, 10]). By following the same
approach proposed in [6] for untimed concurrent systems, the alternating
behaviour¹ of $\mathcal{M}$ can be handled by an $L_\nu$ formula of the form:
$\Phi = \max(X, [accept]ff \wedge \forall [s] \exists \langle s \rangle X)$. Intuitively $\Phi$ holds for $A_{\mathcal{M},w}$ if the
current "or" state is not an accepting state and after any step (leading to an
"and" state), there exists a transition leading to a non-accepting "or" state
and so on. We have $A_{\mathcal{M},w} \models \Phi$ iff the LBATM $\mathcal{M}$ does not accept $w$.
This gives the EXPTIME-hardness for $L_{\mu,\nu}$ and $L_\nu$.

¹ We assume w.l.o.g. that we have a strict alternation of "or" and "and" states
in $\mathcal{M}$, and that the initial and final states are "or" states.

Fig. 1. The automaton $A_\Psi$, whose guard $\varphi$ is obtained from $\psi$ by replacing the
propositions $p_i$ with constraints on the clocks $x_i$.



Specification complexity: In fact the acceptance of $w$ by a LBATM $\mathcal{M}$ can be
reduced in polynomial time to a problem of the form $nil \models \Phi_{\mathcal{M},w}$ where $\Phi_{\mathcal{M},w}$ is
an $L_\nu$ formula. This encoding is based on the use of formula clocks to represent
the configurations of $\mathcal{M}$. This gives the EXPTIME-hardness.

Program complexity: This is due to the proof of EXPTIME-hardness for $L_\nu$
model checking, where the formula $\Phi = \max(X, [accept]ff \wedge \forall [s] \exists \langle s \rangle X)$ does
not depend on the LBATM $\mathcal{M}$. □

Remark 2. In [15], a timed $\mu$-calculus has been proposed, and MC for that logic
was shown to be PSPACE-hard. It is more expressive than $L_{\mu,\nu}$ because it allows
for fixpoint alternations and it uses a powerful binary operator $\triangleright$ (instead of
our modalities $\langle a \rangle$ and $\exists$). In fact the proof of Theorem 1 can be adapted² to
the logic of [15], and this yields an improved lower bound on the complexity of
its MC problem. Moreover, using techniques from [6], we can prove that the MC
problem for the logic of [15] (and for the extension of $L_{\mu,\nu}$ with alternations)
is in EXPTIME, and is thus EXPTIME-complete. To the best of our knowledge this
is the first precise characterization of the complexity of MC for this logic.

Theorem 2. The model checking problem for $L^-$ is PSPACE-complete. Moreover,
the specification complexity of $L^-$ MC is PSPACE-complete. The program
complexity of $L^-$ MC is in P, if the integer constants in the automata are
represented in unary.

Proof. PSPACE membership: A nondeterministic model checking algorithm
in PSPACE can be easily defined by considering the parts of the region graph
associated to $A \models \varphi$ only when they are required. The difference with $L_\nu$ is that
we do not need to compute arbitrary sets of configurations for fixpoints.

PSPACE-hardness: Let $\Psi = Q_1 p_1 \ldots Q_n p_n . \psi$ be an instance of the QBF
(Quantified Boolean Formulae) problem, where each $Q_i \in \{\exists, \forall\}$ and $\psi$ is a
propositional formula over the $p_i$'s. We reduce the validity of $\Psi$ to a model
checking problem. Consider the TA $A_\Psi$ in Figure 1 and the $L^-$ formula

$\Phi = \exists(\langle a_1 \rangle tt \wedge O_1(\exists(\langle a_2 \rangle tt \wedge O_2 \cdots \exists(\langle a_n \rangle tt \wedge O_n \langle sat \rangle tt))))$

where $O_i$ is $\langle u_i \rangle$ (resp. $[u_i]$) if $Q_i$ is $\exists$ (resp. $\forall$). Clearly $A_\Psi \models \Phi$ iff $\Psi$ is
valid.

Specification complexity: In fact any QBF instance can be encoded as a
problem of the form $nil \models \Phi$ with $\Phi \in L^-$, by using formula clocks. This entails
the PSPACE-hardness of specification complexity.

² For example, by considering a formula like
$\mu X. accept \vee [tt \triangleright (p_o \wedge \neg(tt \triangleright (p_e \wedge \neg X)))]$, where $p_e$ (resp. $p_o$) marks even
(resp. odd) states.
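The shape of the formula used in the QBF reduction can be made concrete with a
short sketch that assembles it from the quantifier prefix. This is only an
illustration of the nesting pattern: the function name `build_phi`, the ASCII
rendering of the modalities and the encoding of quantifiers as the strings
"E" and "A" are assumptions made for this example, not notation from the paper.

```python
def build_phi(quantifiers):
    """Build the formula of the QBF reduction as a string.

    quantifiers: list such as ["E", "A"], giving Q_1 ... Q_n
    ("E" for the existential, "A" for the universal quantifier).
    The modality O_i is <u_i> for "E" and [u_i] for "A".
    """
    inner = "<sat>tt"
    # Wrap from the innermost quantifier Q_n outwards to Q_1.
    for i in range(len(quantifiers), 0, -1):
        if quantifiers[i - 1] == "E":
            modality = f"<u{i}>"
        else:
            modality = f"[u{i}]"
        inner = f"∃(<a{i}>tt ∧ {modality}({inner}))"
    return inner


print(build_phi(["E", "A"]))
```

The resulting string grows linearly with the number of quantifiers, in line
with the polynomial-time reduction described above.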

Program complexity: Let $\varphi$ be a given $L^-$ formula. We can define a
polynomial (in $|A|$) algorithm by building the pertinent part of the region
graph in an "on the fly" manner. The key points are that (1) deciding if $\varphi$
holds for a TA $A$ needs to consider only sequences with at most $|\varphi|$ action
transitions and (2) between two action transitions the number of possible delay
transitions is bounded by $2(|C_A| + |K|)(\max(MCst(A), MCst(\varphi)) + 1)$, which is
polynomial in $|A|$ if $MCst(A)$ is given in unary. The time complexity of such an
algorithm is in $O(|A|^{2|\varphi|})$ and, as $\varphi$ is fixed, the program complexity is in P. □

Note that some of our proofs are based upon the realization that the MC
problems of the form $nil \models \varphi$ (where $\varphi$ is a formula in any of the logics
considered so far) are just as hard as the MC problems for arbitrary TA. Thus
the worst-case complexity of MC for these real-time logics may be seen as
deriving solely from the use of clocks in formulae. This pattern will remain
true for all the property languages we study in what follows, except $SBLL^-$
and $L^-_{\forall S}$.

The property language $L_s$ has been introduced in [19] as a sub-language of
$L_\nu$ that allows for more efficient model checking algorithms. To the best of our
knowledge, however, such an intention has not yet been supported by precise
complexity theoretic considerations. These we now proceed to present.

Theorem 3. The complexity of $L_s$ MC is PSPACE-complete. Moreover, the
specification and program complexities of $L_s$ MC are also PSPACE-complete.

Proof. PSPACE membership: For every $L_s$ formula $\varphi$, it is possible to build
a TA $T_\varphi$ such that, for any TA $A$, $A \models \varphi$ iff a reject node of $T_\varphi$ is not reachable
in the parallel composition $(A | T_\varphi)$ [2]. The size of $T_\varphi$ is linear in that of $\varphi$,
and $(A | T_\varphi)$ can be seen as a new TA $A'$ corresponding to the product $A \times T_\varphi$. The
reduction of $A \models \varphi$ to a reachability problem for $A'$ is done in polynomial time,
and thus gives the PSPACE membership.

PSPACE-hardness: A reachability question for node $n$ in a TA $A$ can be
reduced to checking that $A \not\models \max(X, [in\_n]ff \wedge [a]X \wedge \forall X)$ if we suppose that
every edge in $A$ has label $a$, except for a new transition $(n, tt, in\_n, \emptyset, n)$.

Specification complexity: It is possible to reduce reachability in a linear
bounded nondeterministic Turing machine $\mathcal{M}$ with input $w$ to a problem of
the form $nil \models \Phi_{\mathcal{M},w}$ by means of the same kind of encoding used for $L_\nu$.

Program complexity: It is PSPACE-complete because the formula expressing
the reachability problem does not depend on the input automaton. □

Theorem 4. The model checking problem for $L_s^-$ is coNP-complete, as is the
specification complexity of model checking. The program complexity of $L_s^-$ is
in P, if the constants in the input automata are represented in unary.

The property languages $SBLL$ and $L_{\forall S}$ have the same complexity:

Theorem 5. The complexity of $SBLL$ and $L_{\forall S}$ model checking is
PSPACE-complete. Moreover we have that the specification and program
complexities of $SBLL$ and $L_{\forall S}$ MC are also PSPACE-complete.

Fig. 2. Expressiveness vs complexity of model checking. (In the diagram, an
arrow $L \rightarrow L'$ stands for "$L$ is more expressive than $L'$"; the diagram carries
the complexity labels PSPACE and coNP.)



For the property languages $SBLL^-$ and $L^-_{\forall S}$ we obtain the following result:

Theorem 6. The MC problem for $SBLL^-$ and $L^-_{\forall S}$ is PSPACE-complete. The
specification complexity of MC for $SBLL^-$ and $L^-_{\forall S}$ is coNP-hard, and is
coNP-complete if constants in the formulae are represented in unary. Finally,
the program complexity of MC for $SBLL^-$ and $L^-_{\forall S}$ is PSPACE-complete.

There is an implicit recursion (over $\tau$ and delay transitions) which is hidden in
the semantics of the $SBLL^-$ operator $\forall$, and this recursion is sufficient to make
$SBLL^-$ and $L^-_{\forall S}$ model checking PSPACE-hard.

Concluding remarks. The relationships between the relative expressive power
of the property languages that we have considered, and the complexity of their
model checking problems, are summarized in Figure 2. (There $L \rightarrow L'$ means
that any model checking problem $A \models \varphi$ with $\varphi \in L'$ can be reduced in linear
time to a verification $A \models \varphi'$ with $\varphi' \in L$.)

Note that, for every specification language we consider, the proof of
C-hardness of the MC problem uses formulae without clocks. This implies that
the problems $A_{+t} \models^? \Phi$ and $A_{+t} \models^? \Phi_{+t}$ have the same complexity. The remark
about the complexity of MC problems of the form $nil \models \varphi$ shows that
$A \models^? \Phi_{+t}$ and $A_{+t} \models^? \Phi_{+t}$ also have the same complexity. Therefore the
complexity of MC does not depend on whether time is added to the model, to the
specification or to both.

Abstract. In this paper we consider proof techniques for branching-time
temporal logics. While a considerable amount of research has been
carried out regarding the relationship between finite automata and such
logics, practical proof techniques for such logics have received relatively
little attention. Recently, however, several applications requiring refined
proof methods for branching-time temporal logics have appeared, most
notably the specification and verification of multi-agent systems. Thus,
here we extend our clausal resolution method for linear-time temporal
logics to a branching-time framework, in particular to the powerful CTL*
logic. The key elements of the resolution method, namely the normal
form, the concept of step resolution and a novel temporal resolution
rule, are introduced, justified, and applied.



1 Introduction 

A proof method based upon clausal resolution has been developed for linear
discrete temporal logics [10] and has been shown to be particularly amenable to
efficient implementation [6]. It is based upon a normal form that can potentially
represent a range of temporal logics, utilising a variety of model structures [11].
For example, in [4] we extended the resolution method to the (comparatively
simple) branching time temporal logic CTL [5].

We here consider the extension of this approach to the more powerful
branching-time logic CTL* that is now being applied, for example, within the
specification and verification of multi-agent systems [15]. The key elements of
the method, namely the normal form, the concept of step resolution and the form
of the temporal resolution rule, are introduced and justified with respect to CTL*.

Our approach follows the observation that, as the branching structures
characterised by CTL* consist of a set of linear paths, we can use our linear-time
temporal resolution along a given path, while using an additional mechanism in
order to cope with resolution between paths. This is achieved by extending the
normal form so that each temporal formula is labelled with an index identifying
the path on which it is relevant. Thus, resolution can only be carried out between
two formulae if their indices 'match'.

The structure of this paper is as follows. In §2 we briefly outline the syntax
and semantics of CTL*. In §3 we define a normal form used for CTL* formulae,
introducing the notion of indices and providing the context over which the
linear-time operators range. In §3.1 the interpretation of such indexed formulae
is given. Then in §3.2 we describe the algorithm to transform arbitrary CTL*
formulae into the normal form, and in §§3.3-3.6 we present the range of
transformation rules. An example of the transformation of a CTL* formula to its
normal form is given in §3.7. In §4 we introduce a resolution method and the
range of resolution rules. In §4.5 we give an example of the resolution
refutation. Correctness arguments are outlined in §5. Finally, in §6, we provide
concluding remarks and discuss future work.

2 Full Computation Tree Logic — CTL* 

The syntax of CTL* distinguishes state (S) and path (P) formulae. These are
defined inductively as follows.

$S ::= C \mid true \mid false \mid S \wedge S \mid S \vee S \mid S \Rightarrow S \mid \neg S \mid AP \mid EP$
$P ::= S \mid P \wedge P \mid P \vee P \mid P \Rightarrow P \mid \neg P \mid \Box P \mid \Diamond P \mid \bigcirc P \mid P\,\mathcal{U}\,P \mid P\,\mathcal{W}\,P$

Here, $C$ is any well-formed formula of propositional logic, true and false are
constants, $A$ ('on all future paths starting here') and $E$ ('on some future path
starting here') are branching-time path operators, and $\Box$ ('always in the
future'), $\Diamond$ ('at some time in the future'), $\bigcirc$ ('at the next moment in time'),
$\mathcal{U}$ ('until'), and $\mathcal{W}$ ('unless') are linear-time temporal operators. Thus, the
very expressive language of CTL* allows us to represent such complex properties
as

$A\Diamond(\bigcirc p \wedge E\bigcirc\neg p)$.

Following [8], we interpret a well-formed formula of CTL* in a tree-like model
structure $M = \langle S, R, L \rangle$, where $S$ is a set of states, $R \subseteq S \times S$ is a binary
relation over $S$ such that there is a state $s_0$ which is the root of the
structure's tree (i.e. $\forall j.\ ((s_j, s_0) \notin R)$) and every state has at least one
successor (i.e. $\forall i.\ \exists j.\ ((s_i, s_j) \in R)$), and $L$ is an interpretation function
mapping atomic propositional symbols to truth values at each state.

Before continuing with the semantics of CTL* we first introduce some
notation.

A path, $\chi_{s_i}$, over $R$, is a sequence of states $s_i, s_{i+1}, s_{i+2}, \ldots$ such
that $\forall j \geq i.\ (s_j, s_{j+1}) \in R$. A path $\chi_{s_0}$ is called a fullpath. Given a path $\chi_{s_i}$
and a state $s_j \in \chi_{s_i}$, $i \leq j$, we term an infinite sub-sequence
$s_j, s_{j+1}, s_{j+2}, \ldots$ ($s_{j+k} \in \chi_{s_i}$, for $k = 0, 1, \ldots$) a suffix of the path $\chi_{s_i}$,
abbreviating it with $Suf(\chi_{s_i}, s_j)$. We can now give, in Figure 1, the definition
of the satisfaction relation '$\models$'.

Definition 1 (Satisfiability). A well-formed formula, $B$, is satisfiable if, and
only if, there exists a model structure $M$ such that $\langle M, s_0 \rangle \models B$.

Definition 2 (Validity). A well-formed formula, $B$, is valid if, and only if, $B$
is satisfied in every possible model, i.e. for each model structure $M$,
$\langle M, s_0 \rangle \models B$.

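\[
\begin{array}{lcl}
\langle M, s_i \rangle \models p & \text{iff} & p \in L(s_i), \text{ for atomic } p \\
\langle M, s_i \rangle \models \neg A & \text{iff} & \langle M, s_i \rangle \not\models A \\
\langle M, s_i \rangle \models A \wedge B & \text{iff} & \langle M, s_i \rangle \models A \text{ and } \langle M, s_i \rangle \models B \\
\langle M, s_i \rangle \models A \vee B & \text{iff} & \langle M, s_i \rangle \models A \text{ or } \langle M, s_i \rangle \models B \\
\langle M, s_i \rangle \models A \Rightarrow B & \text{iff} & \langle M, s_i \rangle \not\models A \text{ or } \langle M, s_i \rangle \models B \\
\langle M, s_i \rangle \models \mathbf{A}B & \text{iff} & \text{for each } \chi_{s_i},\ \langle M, \chi_{s_i} \rangle \models B \\
\langle M, s_i \rangle \models \mathbf{E}B & \text{iff} & \text{there exists } \chi_{s_i} \text{ such that } \langle M, \chi_{s_i} \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models A & \text{iff} & \langle M, s_i \rangle \models A, \text{ for state formula } A \\
\langle M, \chi_{s_i} \rangle \models \neg A & \text{iff} & \langle M, \chi_{s_i} \rangle \not\models A \\
\langle M, \chi_{s_i} \rangle \models A \wedge B & \text{iff} & \langle M, \chi_{s_i} \rangle \models A \text{ and } \langle M, \chi_{s_i} \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models A \vee B & \text{iff} & \langle M, \chi_{s_i} \rangle \models A \text{ or } \langle M, \chi_{s_i} \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models A \Rightarrow B & \text{iff} & \langle M, \chi_{s_i} \rangle \not\models A \text{ or } \langle M, \chi_{s_i} \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models \Box B & \text{iff} & \text{for each } s_j \in \chi_{s_i}, \text{ if } i \leq j \text{ then } \langle M, Suf(\chi_{s_i}, s_j) \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models \Diamond B & \text{iff} & \text{there exists } s_j \in \chi_{s_i} \text{ such that } i \leq j \text{ and } \langle M, Suf(\chi_{s_i}, s_j) \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models \bigcirc B & \text{iff} & \langle M, Suf(\chi_{s_i}, s_{i+1}) \rangle \models B \\
\langle M, \chi_{s_i} \rangle \models A\,\mathcal{U}\,B & \text{iff} & \text{there exists } s_j \in \chi_{s_i} \text{ such that } i \leq j \text{ and } \langle M, Suf(\chi_{s_i}, s_j) \rangle \models B \\
 & & \text{and for each } s_k \in \chi_{s_i}, \text{ if } i \leq k < j \text{ then } \langle M, Suf(\chi_{s_i}, s_k) \rangle \models A \\
\langle M, \chi_{s_i} \rangle \models A\,\mathcal{W}\,B & \text{iff} & \langle M, \chi_{s_i} \rangle \models \Box A \text{ or } \langle M, \chi_{s_i} \rangle \models A\,\mathcal{U}\,B
\end{array}
\]

Fig. 1. CTL* semantics

Note that the CTL* semantics above requires that, when interpreting atomic
formulae on some path $\chi_{s_i}$, we are referring to the behaviour in the first state
$s_i$ of this path. This causes the failure of the substitution principle for
proposition symbols and hence induces some restrictions on the renaming
procedure which plays a significant role in our resolution method. Further, note
that in [9] it was shown that any CTL* formula $G$ can be transformed to a
particular form $G'$ where the nesting of path quantifiers is at most 2 and where
$\models G'$ iff $\models G$. This is achieved by repeatedly renaming the state subformulae
of $G$. We later utilize this property in transforming formulae (§3.2).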

3 Normal Form for CTL* 

The normal form we use for CTL* is called SNFc*. To define SNFc*, we must
extend the CTL* language slightly. Firstly, we introduce a new constant start:

$\langle M, s_i \rangle \models start$ iff $i = 0$

Secondly, we introduce indices in order to express the path context of a temporal
formula. In general, the index of a formula will tell us whether we can reason
about this formula in the context of all paths (any arbitrary path) or in the
context of some specific path. Indices are based on the two sets
$VAR = \{\alpha, \beta, \gamma, \ldots\}$ representing "path variables" and $FUN = \{f, g, h, \ldots\}$
representing unary path functions. The set of path expressions, IND, is made up
from members of VAR and expressions of the form $f(\alpha)$, where $f \in FUN$ and
$\alpha \in VAR$. Note that indices of the form $f(g(\alpha))$ cannot be obtained as part of
our transformation procedure (see §3.8). We will also see later (§3.4) that path
expressions of the type $f(\alpha)$ appear due to the removal of the $E$ quantifier and
play a similar role to the Skolem functions in predicate logic.

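A formula in SNFc* is of the form

\[ A\Box \bigwedge_{i=1}^{n} (P_i \Rightarrow F_i) \]

where each "$P_i \Rightarrow F_i$" (called a rule) is further restricted to be one of the
following:

\[
\begin{array}{rcll}
start & \Rightarrow & \bigvee_{k=1}^{l} q_k & \text{(an initial rule)} \\
\bigwedge_{i=1}^{m} p_i \langle var \rangle & \Rightarrow & \bigcirc \left[ \bigvee_{k=1}^{l} q_k \right] \langle ind \rangle & \text{(a step rule)} \\
\bigwedge_{i=1}^{m} p_i \langle var \rangle & \Rightarrow & \Diamond l \langle ind \rangle & \text{(a sometime rule)}
\end{array}
\]

Note that $p_i$, $q_k$ and $l$ are literals, $\langle var \rangle$ and $\langle ind \rangle$ are indices providing the
path context for the rules, $\langle var \rangle \in VAR$, and $\langle ind \rangle \in IND$. Note also that, in an
indexed formula such as $C\langle ind_1 \rangle$, the index $\langle ind_1 \rangle$ relates to the whole formula $C$.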

3.1 Interpreting Indexed Formulae 

Any SNFc* rule can be interpreted as a constraint upon the branching structure
(remembering that SNFc* rules are in the scope of an implicit '$A\Box$'). As a
simple example, consider the set of rules
$\{\, start \Rightarrow x,\ \ x\langle\beta\rangle \Rightarrow \Diamond y\langle f(\beta)\rangle,\ \ true\langle\alpha\rangle \Rightarrow \bigcirc\neg z\langle\alpha\rangle \,\}$.
Indexed formulae of this set are interpreted in relation to some model $M$ as
follows:

- The initial rule $start \Rightarrow x$ is understood as "$x$ is satisfied at the initial
  state of $M$".

- The 'sometime' rule $x\langle\beta\rangle \Rightarrow \Diamond y\langle f(\beta)\rangle$ can be interpreted as "for any fullpath
  $\chi$ and any state $s_i \in \chi$ (indicated by index $\langle\beta\rangle \in VAR$) if $x$ is satisfied at
  a state $s_i$ then $\Diamond y$ must be satisfied along some path $\pi_{s_i}$ (indicated by index
  $\langle f(\beta)\rangle$)".

- The step rule $true\langle\alpha\rangle \Rightarrow \bigcirc\neg z\langle\alpha\rangle$ can be interpreted as "for any fullpath $\chi$
  and any state $s_i \in \chi$, if true is satisfied at a state $s_i$ then $\bigcirc\neg z$ must be
  satisfied along any path $\pi_{s_i}$".

3.2 Algorithm for Transforming CTL* Formulae to SNFc* 

We now consider how an arbitrary CTL* formula can be transformed into SNFc*.
Preserving or changing a path index introduced at some stage of the
transformation procedure is important in formulating the transformation rules.
To check the validity of some CTL* formula, $F$, we first negate it and push
the negations into $F$ until they are applied to propositions. This is based on
negations of classical logic operators in addition to the following equivalences.

\[
\begin{array}{lll}
\neg AP \equiv E\neg P & \neg EP \equiv A\neg P & \neg\Box P \equiv \Diamond\neg P \\
\neg(P\,\mathcal{U}\,Q) \equiv \neg Q\,\mathcal{W}\,(\neg P \wedge \neg Q) & \neg\Diamond P \equiv \Box\neg P & \\
\neg(P\,\mathcal{W}\,Q) \equiv \neg Q\,\mathcal{U}\,(\neg P \wedge \neg Q) & \neg\bigcirc P \equiv \bigcirc\neg P &
\end{array}
\]

This gives us a formula $G$ such that $\langle M, s_0 \rangle \models \neg F$ iff $\langle M, s_0 \rangle \models G$. Then the
transformation procedure, $\tau$, is applied to $G$, giving $\tau[G] = \tau_2[\tau_1[G]]$ where $\tau_1$
and $\tau_2$ are described by the steps 1-3 and 4-8 below, respectively.

1. Anchor $G$ to start, obtaining $start \Rightarrow G$.

2. Apply the initial renaming rule obtaining
   $A\Box(start \Rightarrow x_0) \wedge A\Box(x_0\langle var \rangle \Rightarrow G\langle var \rangle)$, where $x_0$ is a new proposition.

3. Reduce the nesting of path quantifiers in $G$, gradually renaming the deepest
   embedded state subformulae (see §3.3).
   This reduction of the nesting of state formulae in $G$ is based on the method
   defined in [9]. Once this has been carried out, we obtain a set of constraints
   of the form $A\Box(P\langle var \rangle \Rightarrow \mathbf{P}C\langle var \rangle)$ or $A\Box(P\langle var \rangle \Rightarrow C\langle var \rangle)$, where $\mathbf{P}$ is
   either of the path operators and $C$ a formula without path operators.

4. Remove path quantifiers (see §3.4); this provides a context for the renaming
   of path formulae in the next step.

5. Rename path formulae (see §3.3), preserving the path context, in order to
   reduce the nesting of temporal operators to exactly 1 and so that every
   temporal operator applies only to literals.

6. Remove temporal operators (see §3.5).

7. Introduce a temporal context (see §3.6) where necessary, and use classical
   equivalences to rewrite rules into their correct form.

8. Rename path variables apart, again if necessary, to ensure that no path
   variable occurs in two different rules.


3.3 Renaming rule 

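We extract subformulae from within a complex formula as follows.

\[
P\langle ind_1 \rangle \Rightarrow Q(R)\langle ind_2 \rangle \ \longrightarrow\ \{\ P\langle ind_1 \rangle \Rightarrow Q(R/x)\langle ind_2 \rangle,\ \ x\langle ind_3 \rangle \Rightarrow R\langle ind_2 \rangle\ \}
\]

Here, $Q(R)$ means "$R$ is a subformula of $Q$", $Q(R/x)$ means the result of
replacing $R$ by a new proposition symbol $x$ in $Q$, and $\langle ind_3 \rangle = \langle ind_2 \rangle$ if
$\langle ind_2 \rangle \in VAR$, but if $\langle ind_2 \rangle = \langle f(\alpha) \rangle$ then $\langle ind_3 \rangle = \langle \alpha \rangle$.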
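A minimal sketch of the renaming rule is given below. The encoding is an
assumption made for illustration: a rule is a tuple (left hand side, left
index, right hand side, right index), formulae are nested tuples, and an index
is either ("var", name) or ("fun", function, name). The helper names
`substitute`, `base_var` and `rename` are hypothetical.

```python
def substitute(formula, target, new):
    """Replace every occurrence of the subformula `target` by `new`."""
    if formula == target:
        return new
    if isinstance(formula, tuple):
        return (formula[0],) + tuple(
            substitute(arg, target, new) for arg in formula[1:]
        )
    return formula


def base_var(index):
    """Map <alpha> to itself and <f(alpha)> to <alpha>."""
    return index if index[0] == "var" else ("var", index[2])


def rename(rule, sub, fresh):
    """Apply the renaming rule, extracting `sub` under the new name `fresh`."""
    lhs, ind1, rhs, ind2 = rule
    x = ("atom", fresh)
    return [
        (lhs, ind1, substitute(rhs, sub, x), ind2),   # P<ind1> => Q(R/x)<ind2>
        (x, base_var(ind2), sub, ind2),               # x<ind3> => R<ind2>
    ]


alpha, delta = ("var", "alpha"), ("var", "delta")
x, y, p = ("atom", "x"), ("atom", "y"), ("atom", "p")
body = ("and", ("X", p), y)
for new_rule in rename((x, alpha, ("F", body), delta), body, "z"):
    print(new_rule)
```

The usage lines rename the subformula under the 'sometime' operator of a rule
of the shape that arises after path quantifiers have been removed.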



3.4 Removal of Path Quantifiers 



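These rules introduce new indices representing a path context of the appropriate
type.

Removal of A:

\[ \frac{P\langle ind_1 \rangle \Rightarrow AF\langle ind_2 \rangle}{P\langle ind_1 \rangle \Rightarrow F\langle ind_3 \rangle} \qquad (\langle ind_3 \rangle \in VAR) \]

Removal of E:

\[ \frac{P\langle ind_1 \rangle \Rightarrow EF\langle ind_2 \rangle}{P\langle ind_1 \rangle \Rightarrow F\langle f(ind_2) \rangle} \qquad (f \in FUN) \]

3.5 Removal of Temporal Operators

Here we must correctly maintain a path context while reducing temporal
operators using their fixed point definitions. In the formulation of the removal
rules below '$x$' is a new proposition symbol, $\langle ind_3 \rangle \in VAR$, and
$\langle ind_3 \rangle = \langle ind_2 \rangle$ if $\langle ind_2 \rangle \in VAR$, but if $\langle ind_2 \rangle = \langle f(\alpha) \rangle$ then $\langle ind_3 \rangle = \langle \alpha \rangle$.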



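Removal of 'always':

\[
\frac{P\langle ind_1 \rangle \Rightarrow \Box F\langle ind_2 \rangle}
{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow (F \wedge x)\langle ind_2 \rangle \\
x\langle ind_3 \rangle \Rightarrow \bigcirc(F \wedge x)\langle ind_2 \rangle
\end{array}}
\]

Removal of 'unless':

\[
\frac{P\langle ind_1 \rangle \Rightarrow (F\,\mathcal{W}\,G)\langle ind_2 \rangle}
{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow (G \vee (F \wedge x))\langle ind_2 \rangle \\
x\langle ind_3 \rangle \Rightarrow \bigcirc(G \vee (F \wedge x))\langle ind_2 \rangle
\end{array}}
\]

Removal of 'until':

\[
\frac{P\langle ind_1 \rangle \Rightarrow (F\,\mathcal{U}\,G)\langle ind_2 \rangle}
{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow (G \vee (F \wedge x))\langle ind_2 \rangle \\
x\langle ind_3 \rangle \Rightarrow \bigcirc(G \vee (F \wedge x))\langle ind_2 \rangle \\
P\langle ind_1 \rangle \Rightarrow \Diamond G\langle ind_2 \rangle
\end{array}}
\]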

3.6 Introducing a Temporal Context and Simplifying

Since, for a purely classical formula, 'F', it is the state, rather than the current
path, that is important in establishing satisfiability, we can apply the following
rules (where F is classical and 'P' is either of the path quantifiers).

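Temporising:

\[
\frac{Q\langle ind_1 \rangle \Rightarrow F\langle ind_2 \rangle}
{\begin{array}{l}
start \Rightarrow (\neg Q \vee F) \\
true\langle ind_1 \rangle \Rightarrow \bigcirc(\neg Q \vee F)\langle ind_2 \rangle
\end{array}}
\]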

Further, we use a number of transformations that correspond to the following
equivalences of CTL*: $\bigcirc(F \wedge Q) \equiv \bigcirc F \wedge \bigcirc Q$, $EAF \equiv AF$, $AEF \equiv EF$,
$AAF \equiv AF$, $EEF \equiv EF$, and the obvious simplifications $T\,false \equiv false$ and
$PT\,false \equiv false$, where 'P' is either of the path quantifiers and 'T' is any unary
temporal operator. Finally, we utilize transformations applied to obtain normal
form in classical logic; we term this set of additional simplifications SIMP and
apply them wherever required.

3.7 Example Transformation

Let us consider the steps required to transform the formula $A\Diamond(\bigcirc p \wedge E\bigcirc\neg p)$
(whose unsatisfiability is not immediately obvious) into SNFc*.

Recall that, following the transformation algorithm, we first apply the initial
renaming rule (steps 2 and 3 below) and then rename the deepest embedded
state subformula (steps 4 and 5).

 1. $start \Rightarrow A\Diamond(\bigcirc p \wedge E\bigcirc\neg p)$                     Given
 2. $start \Rightarrow x$                                          1, Initial Renaming
 3. $x\langle\alpha\rangle \Rightarrow A\Diamond(\bigcirc p \wedge E\bigcirc\neg p)\langle\alpha\rangle$        1, Initial Renaming
 4. $x\langle\alpha\rangle \Rightarrow A\Diamond(\bigcirc p \wedge y)\langle\alpha\rangle$              3, Renaming
 5. $y\langle\alpha\rangle \Rightarrow E\bigcirc\neg p\langle\alpha\rangle$                    3, Renaming

As the nesting of path quantifiers is now of depth 1 (in rules 4 and 5), we remove
these quantifiers (steps 6 and 7), thus generating a certain path context for purely
path formulae.

 6. $x\langle\alpha\rangle \Rightarrow \Diamond(\bigcirc p \wedge y)\langle\delta\rangle$               4, A Removal
 7. $y\langle\alpha\rangle \Rightarrow \bigcirc\neg p\langle f(\alpha)\rangle$                  5, E Removal

Now we can rename a purely path formula $(\bigcirc p \wedge y)$ which occurs in the context
$\Diamond(\bigcirc p \wedge y)\langle\delta\rangle$ (steps 8 and 9) and then apply simplification and temporising
rules.

 8. $x\langle\alpha\rangle \Rightarrow \Diamond z\langle\delta\rangle$                       6, Renaming
 9. $z\langle\delta\rangle \Rightarrow (\bigcirc p \wedge y)\langle\delta\rangle$                6, Renaming
10. $z\langle\delta\rangle \Rightarrow \bigcirc p\langle\delta\rangle$                      9, SIMP
11. $z\langle\delta\rangle \Rightarrow y\langle\delta\rangle$                        9, SIMP
12. $start \Rightarrow (\neg z \vee y)$                             11, Temporising
13. $true\langle\delta\rangle \Rightarrow \bigcirc(\neg z \vee y)\langle\delta\rangle$           11, Temporising

Finally, taking 2, 7, 8, 10, 12 and 13 above and renaming path variables (in 7,
10 and 13) apart we obtain the desired set of SNFc* rules. (In §4.5 we present
a resolution proof for this set of rules, repeating it as a part of that proof.)

3.8 Features of SNFc* 

Here we summarize some important features of the transformation procedure
that can be proved straightforwardly from the transformation algorithm given
above.

1. Labels of the type $\langle f(ind) \rangle$ can appear only on the right hand side of a rule
   and therefore any index on the left hand side is always a path variable.

2. If the index of the right hand side formula is $\langle f(ind) \rangle$ then the left hand side
   is always labeled by $\langle ind \rangle$, which ensures the link between left and right hand
   side indices.

3. We cannot obtain indices where an argument of a function is itself a function,
   i.e. of the type $\langle f(g(\alpha)) \rangle$. Note however, that use of the indices of the type
   $\langle f(ind) \rangle$ is crucial as a function symbol $f$ is a syntactical indication of a
   certain path context (see §3.1).

4. On the left hand side of formulae that appear during the transformation
   procedure we can only have an expression which is one of the following
   types: start, true, a literal or a conjunction of literals.

3.9 Correctness of the transformation

The following theorem (see [3] for details) characterizes the correctness of the
transformation procedure $\tau$.

Theorem 1. A well-formed CTL* formula, $G$, is satisfiable if, and only if, $\tau(G)$
is.

4 Resolution procedure for CTL*

Once the original formula has been transformed into SNFc*, there are two
possible situations in which resolution rules can be applied. The first type of
resolution rule, which we call step resolution (see §4.1), is applied when literals
$l$ and $\neg l$ occur at the same moment of time on the same branch. The second
type of resolution rule, called temporal resolution (see §4.2), can be applied when
some proposition $l$ can occur at all future moments on a path (a situation known
as a loop), while $l$ can be also constrained to be false at some point in the future
of the same path.

In both cases, the use of indices is crucial as they express the required path
contexts. As we will see below, we must be able to unify indices in order to
carry out resolution. Unification between indices is the same as the unification
of terms in classical logic.

4.1 Step Resolution 

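With $l$ as a literal and $\theta(\langle ind_a \rangle, \langle ind_b \rangle)$ as an abbreviation for the unification of
two indices, we have the following step resolution rules.

\[
\text{SRES1:}\quad
\frac{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow \bigcirc(C \vee l)\langle ind_2 \rangle \\
Q\langle ind_3 \rangle \Rightarrow \bigcirc(D \vee \neg l)\langle ind_4 \rangle
\end{array}}
{(P \wedge Q)\langle ind_5 \rangle \Rightarrow \bigcirc(C \vee D)\langle ind_6 \rangle}
\qquad
\text{SRES2:}\quad
\frac{\begin{array}{l}
start \Rightarrow (C \vee l) \\
start \Rightarrow (D \vee \neg l)
\end{array}}
{start \Rightarrow (C \vee D)}
\]

where $\langle ind_6 \rangle = \theta(\langle ind_2 \rangle, \langle ind_4 \rangle)$ and, if $\langle ind_6 \rangle \in VAR$ then $\langle ind_5 \rangle = \langle ind_6 \rangle$, but
if $\langle ind_6 \rangle$ is of the form $\langle f(\alpha) \rangle$ then $\langle ind_5 \rangle = \langle \alpha \rangle$.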
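The SRES1 rule and the unification of indices can be sketched as follows. The
representation is an assumption made for illustration: the left hand side of a
step rule is a frozenset of literals (a conjunction), the right hand side is a
frozenset of literals under 'next' (a disjunction, the empty set standing for
false), literals are strings with "~" marking negation, and indices are
("var", name) or ("fun", function, name). When two variables are unified the
sketch arbitrarily keeps the second one. The names `unify`, `negate` and
`sres1` are hypothetical.

```python
def unify(i, j):
    """Unify two indices as first-order terms; return None on failure."""
    if i[0] == "var":
        return j
    if j[0] == "var":
        return i
    # Both are function applications: the function symbols must agree.
    return i if i[1] == j[1] else None


def negate(lit):
    """Complement of a literal written as 'p' or '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def sres1(rule1, rule2, lit):
    """Resolve two step rules on `lit` (in rule1) and its complement (in rule2)."""
    p, _ind1, c, ind2 = rule1
    q, _ind3, d, ind4 = rule2
    if lit not in c or negate(lit) not in d:
        return None
    ind6 = unify(ind2, ind4)
    if ind6 is None:
        return None
    ind5 = ind6 if ind6[0] == "var" else ("var", ind6[2])
    return (p | q, ind5, (c - {lit}) | (d - {negate(lit)}), ind6)


beta, phi = ("var", "beta"), ("var", "phi")
rule_y = (frozenset({"y"}), beta, frozenset({"~p"}), ("fun", "f", "beta"))
rule_z = (frozenset({"z"}), phi, frozenset({"p"}), phi)
print(sres1(rule_y, rule_z, "~p"))
```

In the usage lines a rule indexed by a path function is resolved with a rule
indexed by a path variable; the resolvent keeps the function index on its right
hand side and the underlying variable on its left hand side.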

4.2 Temporal Resolution 

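To apply the temporal resolution rule (defined below) we must first consider a
loop. A loop is a situation when a literal, say $l$, occurs at all future moments
on some or all paths. In the linear-time case an algorithm for identifying loops
in sets of merged rules has been developed [6] and we expect to extend this
technique to the case of CTL*. Merged rules are generated from step rules as
follows.

\[
\frac{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow \bigcirc C\langle ind_2 \rangle \\
Q\langle ind_3 \rangle \Rightarrow \bigcirc D\langle ind_4 \rangle
\end{array}}
{(P \wedge Q)\langle ind_5 \rangle \Rightarrow \bigcirc(C \wedge D)\langle ind_6 \rangle}
\]

where $\langle ind_6 \rangle = \theta(\langle ind_2 \rangle, \langle ind_4 \rangle)$ and if $\langle ind_6 \rangle \in VAR$ then $\langle ind_5 \rangle = \langle ind_6 \rangle$, else if
$\langle ind_6 \rangle$ is of the form $f(\alpha)$ then $\langle ind_5 \rangle = \langle \alpha \rangle$.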

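Definition 3 (Loop in CTL*). A loop in $l$ is a set of merged rules

\[ \{\ P_i\langle ind_{i,1} \rangle \Rightarrow \bigcirc Q_i\langle ind_{i,2} \rangle \ \mid\ 0 \leq i \leq n \ \} \]

such that for all $0 \leq i \leq n$, both $\models Q_i \Rightarrow l$ and $\models Q_i \Rightarrow \bigvee_{j=0}^{n} P_j$.

We will abbreviate such a loop by $(P_0 \vee \cdots \vee P_n)\langle ind_1 \rangle \Rightarrow \bigcirc\Box l\langle ind_2 \rangle$ where, if for
all $i$, $\langle ind_{i,2} \rangle \in VAR$ then $\langle ind_1 \rangle = \langle ind_2 \rangle = \langle \alpha \rangle$ where $\langle \alpha \rangle$ is a new index in VAR,
but if for all $i$, $\langle ind_{i,2} \rangle$ only involves one function symbol, say $f$, then
$\langle ind_2 \rangle = \langle f(\alpha) \rangle$ and $\langle ind_1 \rangle = \langle \alpha \rangle$ where $\langle \alpha \rangle$ is a new index in VAR, else
$\langle ind_2 \rangle = \langle h(\alpha) \rangle$ and $\langle ind_1 \rangle = \langle \alpha \rangle$ where $\langle \alpha \rangle$ is a new index in VAR and $h$ is a
new function symbol in FUN.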

For this set of merged rules, each right hand side implies one or more left
hand sides from the side condition on loops. As indices on the right hand sides
are either functions or variables, and indices on the left hand sides are always
variables, right hand side indices will always unify with the relevant left hand
side indices. Each right hand side implies $l$. Hence, once one of the left hand
sides is satisfied, the literal $l$ holds at all future moments on some or all paths
(dependent on the type of the index).

Now, once we have detected $(P_0 \vee \cdots \vee P_n)\langle ind_1 \rangle \Rightarrow \bigcirc\Box l\langle ind_2 \rangle$ indicating a
loop in $l$, we can resolve it with a sometime rule $Q\langle ind_3 \rangle \Rightarrow \Diamond\neg l\langle ind_4 \rangle$ provided the
indices $\langle ind_2 \rangle$ and $\langle ind_4 \rangle$ can be unified.

\[
\text{TRES:}\quad
\frac{\begin{array}{l}
P\langle ind_1 \rangle \Rightarrow \bigcirc\Box l\langle ind_2 \rangle \\
Q\langle ind_3 \rangle \Rightarrow \Diamond\neg l\langle ind_4 \rangle
\end{array}}
{Q\langle ind_5 \rangle \Rightarrow (\neg P\,\mathcal{W}\,\neg l)\langle ind_4 \rangle}
\]

where $\theta(\langle ind_2 \rangle, \langle ind_4 \rangle) \neq \emptyset$ and $\langle ind_5 \rangle = \langle ind_4 \rangle$ if $\langle ind_4 \rangle \in VAR$, else if $\langle ind_4 \rangle$ is
of the form $f(\alpha)$ then $\langle ind_5 \rangle = \langle \alpha \rangle$.

Note that in the special case of TRES, when $\Diamond\neg l$ is labeled by $\langle ind_4 \rangle \in VAR$,
we preserve this label for the conclusion even when $\langle ind_2 \rangle$ is $\langle f(\alpha) \rangle$. This is related
to the limit closure property of CTL* ([8]) and is required for completeness ([3]).

Observe also that the conclusion of TRES should be further translated into
SNFc*.


4.3 Transferring Constraints 

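If at any point we derive a formula such as $(P \wedge \cdots \wedge Q)\langle ind_1 \rangle \Rightarrow \bigcirc false\langle ind_2 \rangle$
then we must ensure that $P \wedge \cdots \wedge Q$ never occurs anywhere by applying the
following rule.

\[
\text{Transferral rule:}\quad
\frac{(P \wedge \cdots \wedge Q)\langle ind_1 \rangle \Rightarrow \bigcirc false\langle ind_2 \rangle}
{\begin{array}{l}
start \Rightarrow (\neg P \vee \cdots \vee \neg Q) \\
true\langle ind_1 \rangle \Rightarrow \bigcirc(\neg P \vee \cdots \vee \neg Q)\langle ind_1 \rangle
\end{array}}
\]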



4.4 Termination 

The step and temporal resolution rules are repeatedly applied. If we reach a stage
where no new resolvents are generated, then the procedure terminates. If either
$start \Rightarrow false$ or $true\langle ind_1 \rangle \Rightarrow \bigcirc false\langle ind_2 \rangle$ is generated during the temporal
resolution procedure, it terminates and the original set of rules is unsatisfiable
(see §5).

4.5 Resolution Example

Now we consider a resolution refutation for $A\Diamond(\bigcirc p \wedge E\bigcirc\neg p)$ given as a set of
SNFc* rules obtained in §3.7. We begin the proof repeating this set of rules.



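 1. $start \Rightarrow x$                                      SNFc*
 2. $start \Rightarrow \neg z \vee y$                               SNFc*
 3. $true\langle\epsilon\rangle \Rightarrow \bigcirc(\neg z \vee y)\langle\epsilon\rangle$           SNFc*
 4. $y\langle\beta\rangle \Rightarrow \bigcirc\neg p\langle f(\beta)\rangle$                  SNFc*
 5. $z\langle\phi\rangle \Rightarrow \bigcirc p\langle\phi\rangle$                      SNFc*
 6. $x\langle\alpha\rangle \Rightarrow \Diamond z\langle\delta\rangle$                       SNFc*
 7. $(y \wedge z)\langle\beta\rangle \Rightarrow \bigcirc false\langle f(\beta)\rangle$         4, 5, SRES1
 8. $start \Rightarrow \neg y \vee \neg z$                          7, Transferral
 9. $true\langle\beta\rangle \Rightarrow \bigcirc(\neg y \vee \neg z)\langle\beta\rangle$        7, Transferral
10. $true\langle\beta\rangle \Rightarrow \bigcirc\neg z\langle\beta\rangle$                 3, 9, SRES1
11. $x\langle\delta\rangle \Rightarrow (false\,\mathcal{W}\,z)\langle\delta\rangle$              6, 10, TRES
12. $x\langle\delta\rangle \Rightarrow (z \vee (false \wedge r))\langle\delta\rangle$          11, Removal of W
13. $x\langle\delta\rangle \Rightarrow z\langle\delta\rangle$                        12, SIMP
14. $start \Rightarrow \neg x \vee z$                               13, Temporising
15. $start \Rightarrow false$                                  1, 14, 2, 8, SRES2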
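The last step of the refutation only involves initial rules, so it can be
checked with plain propositional resolution. The sketch below saturates a set
of initial clauses under SRES2; the clause encoding (frozensets of string
literals, "~" for negation) and the names `negate` and `refutes` are
assumptions made for this illustration.

```python
from itertools import combinations


def negate(lit):
    """Complement of a literal written as 'p' or '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def refutes(clauses):
    """Saturate under SRES2; True iff 'start => false' is derivable.

    Each clause is a frozenset of literals standing for the right hand
    side of an initial rule; the empty frozenset stands for false.
    """
    known = set(clauses)
    if frozenset() in known:
        return True
    while True:
        new = set()
        for c, d in combinations(known, 2):
            for lit in c:
                if negate(lit) in d:
                    new.add((c - {lit}) | (d - {negate(lit)}))
        if frozenset() in new:
            return True
        if new <= known:          # no new resolvents: saturation reached
            return False
        known |= new


initial_rules = [
    frozenset({"x"}),             # rule 1
    frozenset({"~z", "y"}),       # rule 2
    frozenset({"~y", "~z"}),      # rule 8
    frozenset({"~x", "z"}),       # rule 14
]
print(refutes(initial_rules))
```

The loop terminates because only finitely many clauses can be built from a
finite set of literals.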



5 Correctness of the resolution procedure 

Our correctness argument is given by a number of theorems stating fundamental
properties of the above resolution procedure.

Theorem 2 (Termination). Given any set R of SNFc* rules the resolution
procedure applied to R terminates.

Theorem 3 (Soundness of the resolution procedure). Given a set R of
SNFc* rules if there is a resolution refutation for R (see §4.4) then R is
unsatisfiable.

Theorem 4 (Completeness of the resolution procedure). If a set R of 
SNFc* rules is unsatisfiable then there exists a resolution refutation for R. 

The proof of theorem 2, based on the finiteness of R, is relatively simple,
and the reader is referred to [3]. The proofs of theorems 3 and 4 are more complex,
requiring a number of technical definitions from graph and automata theory.
Due to lack of space we only sketch outlines of these proofs, again referring
to [3] for full details.

First of all, note that within the proofs of theorems 3 and 4 we utilise
ideas from the alternating automata approach to temporal logic [13, 14, 2, 17]
and essentially use the finite tree model property of CTL*.

The following observation is important for the proofs. The structure of a set
of SNFc* rules (see §3) can be related to that of a transition system. The initial
rules provide the starting conditions while the step and eventuality rules can be
considered as “global” transition rules. Thus, for a set R of SNFc* rules obtained
for some CTL* formula G we define an infinite alternating computation tree, which
is a type of labeled AND-OR tree. Subformulae of R are used as the labels for
the states of such a tree. Transitions in the AND-OR tree are hyper-transitions,
allowing a state $t_i$ to have a set of successors $t_j, \dots, t_k$, called AND-successors
of $t_i$, indicating the application of several rules simultaneously. Hyper-transitions
are determined by a transition function defined similarly to the one for the hesitant
alternating tree automaton ([17]) and using the set of the global rules of R.

Then we show that, due to the finite model property of CTL*, an infinite
computation tree collapses into a finite graph. This graph corresponds to a set
of possible runs of a non-deterministic Büchi tree automaton [9].

A set of standard deletion rules is applied to the graph: nodes with
outstanding eventualities and nodes without successors must be deleted (as paths
through these nodes cannot form part of any model of R). This pruning
procedure terminates, resulting in a Büchi tree automaton, $B_R$, that accepts exactly
those trees that satisfy the set R. This proves that a set R of SNFc* rules is
unsatisfiable if and only if the automaton $B_R$ is empty.
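
The deletion rules can be pictured as a fixpoint computation on a finite graph. The sketch below is a deliberate simplification and not the construction from [3]: it ignores the AND-OR structure of hyper-transitions, and the graph encoding (`edges`, `holds`, `pending`) is hypothetical. It repeatedly deletes nodes without surviving successors and nodes carrying an eventuality that no surviving path can fulfil.

```python
def can_fulfil(node, lit, alive, edges, holds):
    """True if some surviving node reachable from `node` (itself included)
    makes the literal `lit` true."""
    seen, todo = {node}, [node]
    while todo:
        n = todo.pop()
        if lit in holds.get(n, ()):
            return True
        for m in edges.get(n, ()):
            if m in alive and m not in seen:
                seen.add(m)
                todo.append(m)
    return False

def prune(nodes, edges, holds, pending):
    """Apply the two deletion rules until nothing changes.

    edges:   node -> iterable of successor nodes
    holds:   node -> set of literals true at the node
    pending: node -> set of literals whose eventuality is outstanding
    """
    alive = set(nodes)
    changed = True
    while changed:
        changed = False
        for n in list(alive):
            no_successor = not (set(edges.get(n, ())) & alive)
            unfulfillable = any(
                not can_fulfil(n, lit, alive, edges, holds)
                for lit in pending.get(n, ())
            )
            if no_successor or unfulfillable:
                alive.discard(n)
                changed = True
    return alive
```

In this simplified picture, an empty result for the nodes matching the initial rules plays the role of the emptiness of $B_R$.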

Finally, we show that the deletions in the graph of $B_R$ correspond to the
resolution rules we have developed, thus proving that a set R of SNFc*
rules is unsatisfiable if and only if there exists a resolution refutation for R.

Therefore, taking into account theorem 1, we conclude that, given a CTL*
formula G, if there exists a resolution refutation for the set of SNFc* rules
R generated from G then R (and hence G) is unsatisfiable. If the resolution
procedure for R terminates without finding a refutation, it is possible to extract
a model for R, and thus R (and hence G) is satisfiable.



6 Conclusions 

We have extended the clausal resolution method developed for linear-time
temporal logics to a branching-time framework. This will form the basis of future
work into both the efficient implementation of this approach, where we expect to
utilise techniques developed for implementing linear-time temporal resolution,
and a detailed complexity analysis. In addition, we will also further examine the
relationship between the different types of branching-time temporal logics by
studying their translation into our normal form.

The authors would like to thank the anonymous referees for useful comments. 
This work was supported by EPSRC under research grant GR/L87491.



Abstract. Since Muller and Schupp have shown that monadic second-order
logic is decidable for context-free graphs in [MS85], several specialized
procedures have been developed for related problems, mostly for
sublogics like the modal mu-calculus, or even its alternation-free fragment.
This work shows the decidability of S1S, the trace version of MSOL, for the
richer set of macro graphs. The generation mechanism of macro graphs
is of higher-order nature and relates to the context-free one as macro
grammars [Fis68] relate to context-free grammars.

Technically, the result follows from the decidability of the emptiness 
problem of the trace language of a macro graph with fairness. The deci- 
sion procedure is given in form of a tableau system. Soundness and com- 
pleteness follow from the relation of the (finite) tableaux to their infinite 
unfoldings. This kind of proof promises to be helpful in the derivation of 
further results. 



1 Introduction 

During the eighties several modal logics like CTL, LTL, CTL* and the modal
mu-calculus have been developed and shown to be decidable in finite structures.
Model checkers have been implemented and successfully applied to the
verification or debugging of mostly finite systems, e.g. hardware or finite instances of
protocols. Infinite-state systems, arising from Petri nets or recursive programs,
pose a different problem. There have been results for both kinds of systems.
Muller and Schupp set a landmark by showing that the monadic second-order
logic MSOL of context-free graphs is decidable, though with an inherently
nonelementary complexity. What about specialized procedures for weaker logics, for
instance the mu-calculus?

Indeed, the nineties have seen several such results so far. Context-free
processes¹ are a subset of rooted context-free graphs. Burkart and Steffen were the
first to give a decision procedure for the alternation-free mu-calculus in a
relevant subset of the context-free processes [BS92], and extended it to the full set of
guarded context-free processes in [BS94]. Both generalizations of the structure
set and the logic were considered, and local as well as iterative (global) model
checkers were developed. One may use a higher-order version of the generation
process for structures and decide the alternation-free mu-calculus [Hun94], or add
non-synchronized parallel composition like in BPP and check even nonregular
properties [BEH95]. Or one may take directly the set of context-free graphs with
alternation-free formulas [BQ96], or guarded context-free processes with the full
mu-calculus [Wal96,BS97].

¹ The use of the term “context-free” is inconsistent in the literature. We will elaborate
on that later. In this short overview, we use our terminology.

The result in this paper is a step towards unification of these results. We show
that $\omega$-regular properties, i.e. monadic second-order logic over one successor
(S1S) (the trace version of MSOL), can be decided for macro processes. Macro
processes were already studied in [Hun94]. They take the idea of a context-free
generation to the extreme by permitting (process) variables of arbitrary
higher-order type. Ordinary variables appear as objects of type level one. As already
known from the study of the call trees of programs with finitely typed procedures,
increasing the type level enriches the set of definable structures. In addition to
considering higher-typed recursion, the assumption of guardedness is dropped,
thereby including infinitely branching processes in the picture.

The decidability of $\omega$-regular properties is derived as a consequence of the
soundness and completeness of a tableau system deciding the emptiness problem
of the trace language of a macro process under an additional fairness constraint.
In our construction, the fairness constraint comes from the acceptance set of a
Büchi automaton composed in parallel with the macro process at hand.

The tableau system in this paper is a simplified version of one which not only
deals with the emptiness problem, but which is also capable of treating arbitrary
CTL specifications under fairness constraints.² The soundness and completeness
proofs for the emptiness problem can be extended to the more general case of
CTL. The techniques promise to be applicable to further problems in the area. In
particular, one can hope to solve the decidability of MSOL in much the same way,
extending the result of Muller and Schupp to macro processes. Also, a specialized
procedure for the full mu-calculus should be obtainable.

2 Processes and Specifications 

2.1 Kripke Structures 

Kripke structures and labeled transition systems are equivalent ways of modeling
reactive behavior. In this paper, we choose Kripke structures as our semantic
domain, and we add fairness constraints.

A Kripke structure is a quintuple $(S, R, \mathcal{A}, L, I)$ where $S$ is a set of states,
$R \subseteq S \times S$ is the transition relation, $\mathcal{A}$ is a set of atoms,
$L : S \to \mathcal{P}(\mathcal{A})$ is the labeling function, and $I \subseteq S$ is the set of initial states.

Fairness constraints are added as an additional component $C = \{c_1, \dots, c_n\}$,
$c_i \subseteq S$. A path in a Kripke structure is an infinite sequence $s_0, s_1, \dots$ of states
with $(s_j, s_{j+1}) \in R$. It is a fair path if for all $c_i$, infinitely many $s_j$ are in $c_i$. The
reader may note that a Büchi automaton is a Kripke structure with finitely many
states and one fairness constraint. In this paper, we will consider only structures
with one fairness constraint. A state is called fair if it is in the fairness set. A
trace of a Kripke structure is the sequence of state labelings along a fair path of
the structure.

² We had to remove the CTL system from this paper due to space limitations. It was
included in the submitted version, which can be obtained from the author.
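
To make the definitions concrete, here is a minimal sketch of a finite Kripke structure with a single fairness constraint. The class and function names are illustrative assumptions, not notation from the paper. Since an infinite path cannot be stored directly, the sketch represents an ultimately periodic path as a finite `stem` followed by a `loop` that repeats forever; such a path is fair exactly when the loop visits the fairness set.

```python
from dataclasses import dataclass

@dataclass
class Kripke:
    states: set
    trans: set       # set of (s, s') pairs, the relation R
    atoms: set
    label: dict      # state -> frozenset of atoms, the function L
    init: set
    fair: set        # the single fairness constraint

def is_path_segment(k, seq):
    """Consecutive states of `seq` are related by the transition relation."""
    return all((a, b) in k.trans for a, b in zip(seq, seq[1:]))

def is_fair_lasso(k, stem, loop):
    """Check that stem . loop^omega is a fair path starting in an initial state."""
    if not loop:
        return False
    whole = list(stem) + list(loop)
    if whole[0] not in k.init:
        return False
    if not is_path_segment(k, whole + [loop[0]]):   # the loop must close
        return False
    return any(s in k.fair for s in loop)           # fair state seen infinitely often

def trace_prefix(k, stem, loop):
    """State labelings along one traversal of the stem and the loop."""
    return [k.label[s] for s in list(stem) + list(loop)]

k = Kripke(
    states={"s0", "s1", "s2"},
    trans={("s0", "s1"), ("s1", "s0"), ("s0", "s2"), ("s2", "s2")},
    atoms={"idle", "work"},
    label={"s0": frozenset({"idle"}), "s1": frozenset({"work"}),
           "s2": frozenset({"idle"})},
    init={"s0"},
    fair={"s1"},
)
```

Here the path that alternates between `s0` and `s1` is fair, whereas the path that moves to `s2` and stays there is a path but not a fair one.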



2.2 Macro Processes 



We use recursive declarations over macro terms for our syntactic representation
of infinite Kripke structures. The name “macro” comes from [Fis68], where it
was used for a higher-order generalization of context-free grammars. A macro
term has a type, which is either $\kappa$, the basic type denoting Kripke structures,
or a functional type $\rho \to \tau$. Each type has a level with $\ell(\kappa) = 0$ and
$\ell(\rho \to \tau) = \max(1 + \ell(\rho), \ell(\tau))$. The operator “$\to$” is assumed to associate
to the right. $\kappa$ is the only type of level 0; the types of level 1 have the form
$\kappa \to \dots \to \kappa \to \kappa$ with at least two $\kappa$.
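
The level function can be computed by a direct recursion on the type. The following sketch assumes a hypothetical encoding of types, with the string `"kappa"` for the basic type and a tuple for a functional type.

```python
KAPPA = "kappa"                      # the basic type of Kripke structures

def arrow(rho, tau):
    """The functional type rho -> tau (encoding chosen for this sketch)."""
    return ("->", rho, tau)

def level(t):
    """Type level: 0 for kappa, max(1 + level(rho), level(tau)) for rho -> tau."""
    if t == KAPPA:
        return 0
    _, rho, tau = t
    return max(1 + level(rho), level(tau))

unary = arrow(KAPPA, KAPPA)                    # kappa -> kappa
binary = arrow(KAPPA, arrow(KAPPA, KAPPA))     # kappa -> kappa -> kappa
second_order = arrow(unary, KAPPA)             # (kappa -> kappa) -> kappa
```

Both `unary` and `binary` have level 1, since adding further arguments of type $\kappa$ does not raise the level, while `second_order` has level 2 because its argument is itself of level 1.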

The set of macro terms over a set of atoms $\mathcal{A}$ is given by

\[
\begin{array}{rcl}
t_\kappa & ::= & D \mid t_\kappa + t_\kappa \\
t_{\kappa \to \kappa} & ::= & A \\
t_\tau & ::= & P_\tau \mid t_{\rho \to \tau} \cdot t_\rho
\end{array}
\]

where $A$ is a finite subset of $\mathcal{A}$ and $P_\tau$ is a variable. When writing terms, $\cdot$ is
assumed to associate to the left and to have higher priority than $+$. The formal
semantics of macro terms without variables is a special case (empty declaration
part) of the semantics of a macro process provided in Fig. 1. Intuitively, a macro
term denotes a finite Kripke structure with an acyclic transition relation. Its
states are the subterms of the form $A \cdot t_\kappa$, which are called state terms. Such a
term introduces a state labeled by $A$ which is connected to the initial states of the
Kripke structure denoted by $t_\kappa$. $D$ (deadlock) stands for the empty structure,
$+$ is union. Functional application ($\cdot$) models, on level one, ordinary sequential
composition. Thus, $A$ can be seen as a state with one “exit”, whereas a term
of type $\kappa \to \kappa \to \kappa$ gives a Kripke structure with two exits, which are to be
connected to the initial state sets of the argument structures.

As an example, the term

    $\{idle\} \cdot (\{work\} \cdot D + \{idle\} \cdot (\{work\} \cdot D))$

denotes a Kripke structure as drawn below.

    --> {idle} --> {idle} --> {work}
          |                     ^
          +---------------------+

To get structures with cycles and also infinite structures, we will need recursion.
A recursive declaration for a set of typed variables $P_1, \dots, P_n$ is a set of equations

\[
P_i \cdot F_{i,1} \cdot \ldots \cdot F_{i,n_i} = t_i(P_1, \dots, P_n, F_{i,1}, \dots, F_{i,n_i})
\]

one for each $P_i$, where both sides of the equation are of type $\kappa$ with appropriately
typed variables $F_{i,1}, \dots, F_{i,n_i}$.

A macro process is given by a recursive declaration for a set of typed variables
$P_1, \dots, P_n$ and a main term $t_\kappa(P_1, \dots, P_n)$. The maximal type level of the
variables gives the type level of the process. The Kripke structure denoted by
a macro process is defined via two auxiliary functions dist and next. dist gives
the set of state terms denoted by a term by extracting the summands which are
already state terms and inserting the declaration of a variable $P$ for summands
of the form $P \cdot t_1 \cdot \ldots \cdot t_n$, substituting the actual parameters for the formals.
next provides, for a state term, the term describing its set of successors. The
states of the Kripke structure denoted by a macro process are those state terms
which can be generated from the main term by repeated application of dist and
next. The composition of next with dist gives the successor relation. The formal
definition is given in Fig. 1. Fairness will be added to the structures by specifying
sets of state labelings which are required to occur infinitely often.

\[
\begin{array}{rcl}
dist_n(D) & =_{df} & \emptyset \\
dist_n(t_\kappa + u_\kappa) & =_{df} & dist_n(t_\kappa) \cup dist_n(u_\kappa) \\
dist_n(A \cdot t_\kappa) & =_{df} & \{A \cdot t_\kappa\} \\
dist_0(P_i \cdot u_1 \cdot \ldots \cdot u_{n_i}) & =_{df} & \emptyset \\
dist_{n+1}(P_i \cdot u_1 \cdot \ldots \cdot u_{n_i}) & =_{df} & dist_n(t_i[u_1/F_{i,1}, \dots, u_{n_i}/F_{i,n_i}]) \\
dist(t_\kappa) & =_{df} & \bigcup_{n \ge 0} dist_n(t_\kappa) \\
next(A \cdot t_\kappa) & =_{df} & t_\kappa \\
L(A \cdot t_\kappa) & =_{df} & A \\
states_0(t_\kappa) & =_{df} & dist(t_\kappa) \\
states_{n+1}(t_\kappa) & =_{df} & dist(next(states_n(t_\kappa)))
\end{array}
\]

The Kripke structure of a macro process with main term $t_0$ and occurring set of atoms
$\mathcal{A}$ is given by:

\[
\Bigl(\bigcup_{n \ge 0} states_n(t_0),\ dist \circ next,\ \mathcal{A},\ L,\ dist(t_0)\Bigr)
\]

Fig. 1. The semantics of a macro process

As an example of a macro process, consider the declaration

    $R \cdot G = G \cdot (R \cdot G) + \{idle, ready\} \cdot (\{work\} \cdot (R \cdot G))$,

with main term $R \cdot \{idle\}$. This macro process of level 2 defines the finite Kripke
structure below.

    $\{idle\} \cdot (R \cdot \{idle\})$     $\{idle, ready\} \cdot (\{work\} \cdot (R \cdot \{idle\}))$ --> $\{work\} \cdot (R \cdot \{idle\})$
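
The functions dist and next can be prototyped directly on a term representation. The sketch below is an illustration under stated assumptions rather than the paper's algorithm: terms are encoded as nested tuples, all names are hypothetical, terms are assumed closed and well typed, and the infinite union defining dist is cut off at a fixed unfolding `depth` (with a `max_states` bound on exploration), so it only approximates the semantics for processes that need deeper unfolding or have infinitely many states.

```python
D = ("D",)                                    # deadlock, the empty structure

def const(*atoms):                            # a label A, of type kappa -> kappa
    return ("const", frozenset(atoms))

def var(name):                                # a declared or formal variable
    return ("var", name)

def app(f, *args):                            # left-associative application
    for a in args:
        f = ("app", f, a)
    return f

def plus(t, u):
    return ("plus", t, u)

def spine(t):
    """Split f . a1 . ... . ak into its head f and the argument list."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def subst(t, env):
    """Replace formal parameters by actual arguments."""
    if t[0] == "var":
        return env.get(t[1], t)
    if t[0] in ("app", "plus"):
        return (t[0], subst(t[1], env), subst(t[2], env))
    return t

def dist(t, decls, depth):
    """State terms of a closed term of type kappa, unfolding declarations
    at most `depth` times (this is dist_n with n = depth)."""
    head, args = spine(t)
    if head[0] == "D":
        return set()
    if head[0] == "plus":
        return dist(head[1], decls, depth) | dist(head[2], decls, depth)
    if head[0] == "const":
        return {t}                            # already a state term A . t
    if depth == 0:
        return set()
    formals, body = decls[head[1]]
    return dist(subst(body, dict(zip(formals, args))), decls, depth - 1)

def kripke(main, decls, depth=8, max_states=1000):
    """Explore the states generated by dist and next, within the bounds."""
    init = dist(main, decls, depth)
    states, todo, trans = set(init), list(init), set()
    while todo and len(states) < max_states:
        s = todo.pop()
        for s2 in dist(s[2], decls, depth):   # s[2] is next(s)
            trans.add((s, s2))
            if s2 not in states:
                states.add(s2)
                todo.append(s2)
    return states, trans, init

def label(state):
    return state[1][1]                        # the label A of A . t

# The level-2 example: R . G = G . (R . G) + {idle, ready} . ({work} . (R . G))
G, R = var("G"), var("R")
decls = {"R": (["G"], plus(app(G, app(R, G)),
                           app(const("idle", "ready"),
                               app(const("work"), app(R, G)))))}
states, trans, init = kripke(app(R, const("idle")), decls)
```

For the example declaration, the exploration stops after finding three state terms, two of which are initial, which matches the finite structure described in the text.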



Though this declaration of type level 2 yields a finite structure, this is not true 
in general. We will discuss the issue of the power of macro processes in the 
following. 

It is easy to see that we can write any finite Kripke structure as a macro
process of type level 0. Just introduce one variable $P_s$ for each state $s$, and write
the equations $P_s = L(s) \cdot (P_{s_1} + \dots + P_{s_n})$, where the $s_i$ are the successors of
$s$. For the main term, we take the sum of the variables for the initial states. On
the other hand, if the process is of level 0, i.e. all variables are of type $\kappa$, the
denoted Kripke structure is always finite. So type level 0 gives us exactly the
finite structures. We call those processes regular.
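
The translation from a finite Kripke structure to a level-0 declaration is mechanical. The sketch below prints the equations as strings; the function name and the dictionary-based input format are assumptions made for illustration, and rendering an empty successor set as `D` is likewise an assumption (the empty sum denotes the empty structure).

```python
def regular_declaration(label, succ, init):
    """Build one equation P_s = L(s) . (P_s1 + ... + P_sn) per state.

    label: state -> set of atoms
    succ:  state -> list of successor states
    init:  list of initial states
    Returns (equations, main_term) as strings.
    """
    def sum_of(targets):
        # An empty sum is written as the deadlock term D.
        return " + ".join(f"P_{t}" for t in targets) if targets else "D"

    equations = {}
    for s in label:
        atoms = "{" + ", ".join(sorted(label[s])) + "}"
        equations[f"P_{s}"] = f"{atoms} · ({sum_of(succ.get(s, []))})"
    return equations, sum_of(init)

equations, main = regular_declaration(
    label={"a": {"idle"}, "b": {"work"}, "c": {"done"}},
    succ={"a": ["b", "c"], "b": ["a"]},
    init=["a"],
)
```

Each variable stands for one state, so the number of equations equals the number of states and no variable takes parameters.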

Type level 1 provides us with more structures than those which were called
context-free processes in [BS92], modulo the fact that we get Kripke structures
instead of labeled transition systems. Already the use of variables of type $\kappa \to \kappa$
yields the recursion pattern available in [BS92]. But we do not have the
restriction to guarded recursion. This results in the possibility of infinite branching, as
can be seen from the process with declaration $Q_{\kappa \to \kappa} \cdot F = Q \cdot (A \cdot F) + F$ and
main term $B \cdot (Q \cdot D)$. This process denotes a Kripke structure with one initial
state labeled by $B$, which has successors $A^n \cdot D$ for each $n$.

Furthermore, besides infinite branching, permitting variables with more than
one argument of type $\kappa$ offers the possibility to define the pushdown processes of
[BS94]. This is most easily seen by resorting to the characterization of pushdown
processes as being those structures which result from parallel composition of a
finite process and a context-free one (in the sense of [BS92]). In fact, a simple
proof shows that on each type level, macro processes are effectively closed under 
parallel composition with finite structures. The idea is to introduce, for each 
subterm of the declaration, one copy for each state s of the finite structure 
representing the original term with the finite structure being in the state s. 
Transitions in the finite system are represented by an appropriate change of 
copies. The transformation induces, for m states, m copies of each declared 
process variable, and in each declaration, m copies of each formal parameter 
variable. As the type level of a declaration does not depend on the number of 
parameters but only on their maximal level, this does not increase the type level 
of the process variables. 

Observation 1. Given a macro process of type level $l$ and a regular process, the
synchronous parallel composition of the two is bisimilar to a macro process of
level $l$ which can be computed effectively.

In [BQ96], the problem of model checking regular graphs was considered. A graph 
is regular if it can be generated by a (context-free like) hyperedge replacement 
system. In fact, in the graph-grammar community they are often called context- 
free graphs, see [Hab92]. It is easily seen that for every regular graph there is a 
bisimilar macro process of level 1 (modulo the fact that we use vertex labeling 
instead of edge labeling which makes no essential difference). The same holds for 
the context-free graphs from [MS85]. So it can be said that with the exception 
of the combination of BPP and BPA from [BEH95], all considered domains are 
subsumed by macro processes of level 1. 

As indicated above, the attributes context-free, pushdown and regular are
used inconsistently in the literature. From our point of view, we would prefer
to use the name context-free process for macro processes of type level 1. These
include everything previously named context-free, pushdown or regular in this
field. A subset potentially worthwhile to study separately consists of those processes
where recursion is guarded, resulting in structures with bounded degree.

It should be noted that the type level of macro processes induces a strict
hierarchy on the set of definable structures. This can be derived from results about
the call trees of programs with higher-type procedures [Dam79]. For instance,
on type level 2 we can formulate a second-order stack (a stack of stacks), cf.
[Hun94]. The traces of a second-order stack cannot be generated by a context-free
production system (it is not an algebraic language in the sense of [Tho90]),
while this can be done for each macro process of level 1. In general, a stack of
level n [KTU87] can be written as a process of type level n. Besides being able
to define more processes in our higher-typed language, there is of course also the
benefit of being able to use higher types to better structure the definitions. Going
beyond the domain of finitely-typed recursion would, however, be problematic.
Processes defined by general, untyped recursion have an undecidable emptiness
problem.



2.3 Specifications 

As we deal with the emptiness problem, the only specification of a Kripke
structure is $\varepsilon$, saying that there is no fair path. Intermediate assertions in a tableau
may concern higher-order subterms of macro terms. For those, we introduce
higher-order emptiness specifications.

\[
\varphi_{\rho \to \tau} ::= \eta_\rho \to \varphi_\tau
\qquad
\eta_\rho ::= \emptyset \mid \eta_\rho, \eta_\rho \mid (f)\,\varphi_\rho \mid \varphi_\rho
\]

A higher-order specification $\varphi_{\rho \to \tau}$ contains a list of assumptions $\eta_\rho$ about the
argument. This list may be empty ($\emptyset$), or may contain one or more
assumptions, which may be guarded. The guard, $(f)$, means that this assumption may
be used after a fair state has been encountered, while this is not permitted with
unguarded assumptions.

2.4 Sequents 

Assertions in the tableau system are sequents of the general form

\[
H : t_\tau \vdash \eta_\tau .
\]

$H$, if it is not empty, is a list of assumptions about the formal parameters which
may appear in $t_\tau$. Elements of the list are sequents of the form $F_\tau \vdash \eta_\tau$.

In the tableau system which is presented in the next section, an assumption
not guarded by $(f)$ is replaced by $\emptyset$ when a fair state is encountered explicitly
(when reasoning moves from $A \cdot t$ to $t$ if $A$ labels a fair state) or implicitly (in
the first argument rule, where the presence of $(f)$ implies that inside the process
body, before the argument was invoked, a fair state might have been encountered).

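We formulate the maintenance mechanism of fairness guards in an auxiliary
function rem.

\[
\begin{array}{rclcrcl}
rem(\varphi_\tau) & =_{df} & \emptyset & \qquad & rem((f)\,\varphi_\tau) & =_{df} & (f)\,\varphi_\tau \\
rem(\emptyset) & =_{df} & \emptyset & \qquad & rem(\eta_1, \eta_2) & =_{df} & rem(\eta_1), rem(\eta_2) \\
rem(F \vdash \eta) & =_{df} & F \vdash rem(\eta) & \qquad & rem(H_1, H_2) & =_{df} & rem(H_1), rem(H_2)
\end{array}
\]

We say that a list of assumptions contains a sequent, $F_\tau \vdash \varphi_\tau \in H$, if there is
a sequent $F_\tau \vdash \eta_\tau$ in $H$ and either $\varphi_\tau$ or $(f)\,\varphi_\tau$ appears in the list $\eta_\tau$.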
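
Read operationally, rem filters a hypothesis list: unguarded assumptions are dropped and guarded ones are kept. The sketch below assumes a hypothetical encoding in which an assumption list is a tuple of `("plain", spec)` or `("guarded", spec)` items and a hypothesis list is a tuple of `(formal, assumptions)` pairs; the empty tuple plays the role of the empty list.

```python
def rem(eta):
    """rem on an assumption list: unguarded items become empty, guarded items stay."""
    return tuple(item for item in eta if item[0] == "guarded")

def rem_hyps(hyps):
    """rem on a list H of sequents F |- eta, applied to each right-hand side."""
    return tuple((formal, rem(eta)) for formal, eta in hyps)

def contains(hyps, formal, spec):
    """F |- spec is in H if spec occurs, guarded or not, in F's assumption list."""
    return any(f == formal and any(s == spec for _, s in eta) for f, eta in hyps)

H = (
    ("F1", (("plain", "eps"), ("guarded", "eps"))),
    ("F2", (("plain", "eps"),)),
)
```

After a fair state has been passed, only the guarded assumption about `F1` remains usable, and applying rem a second time changes nothing.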

3 The Tableau System 

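The rules of the system are given in Fig. 2. In general, the schematic rules have
the form that one sequent is written above the line, and perhaps more than one
below the line. An instance of a rule schema is built by performing appropriate
substitutions for the metavariables for specifications and terms.

Plus
\[
\frac{H : t_1 + t_2 \vdash \varepsilon}{H : t_1 \vdash \varepsilon, \quad H : t_2 \vdash \varepsilon}
\]

Declared call with hypotheses or parameters
\[
\frac{H : P \cdot t_1 \cdot \ldots \cdot t_n \vdash \varepsilon}
     {P \vdash \eta_1 \to \dots \to \eta_n \to \varepsilon, \quad H : t_1 \vdash \eta_1, \ \dots, \ H : t_n \vdash \eta_n}
\]

Pure declared call
\[
\frac{P \vdash \eta_1 \to \dots \to \eta_n \to \varepsilon}
     {F_1 \vdash \eta_1, \dots, F_n \vdash \eta_n : t_P \vdash \varepsilon}
\qquad P \cdot F_1 \cdot \ldots \cdot F_n = t_P \text{ is the declaration of } P
\]

Formal call
\[
\frac{H : F \cdot t_1 \cdot \ldots \cdot t_n \vdash \varphi}
     {H : t_1 \vdash \eta_1, \ \dots, \ H : t_n \vdash \eta_n}
\qquad F_\tau \vdash \eta_1 \to \dots \to \eta_n \to \varphi \in H
\]

Pure constant call
\[
\frac{H : A \vdash \eta \to \varepsilon}{F_\kappa \vdash \eta : A \cdot F_\kappa \vdash \varepsilon}
\]

Constant call with parameter
\[
\frac{H : A \cdot t \vdash \varepsilon}{H : t \vdash \varepsilon}
\quad A \text{ is not a fair labeling}
\qquad
\frac{H : A \cdot t \vdash \varepsilon}{rem(H) : t \vdash \varepsilon}
\quad A \text{ is a fair labeling}
\]

Argument rules
\[
\frac{H : t \vdash (f)\,\varphi}{rem(H) : t \vdash \varphi}
\qquad
\frac{H : t \vdash \eta_1, \eta_2}{H : t \vdash \eta_1, \quad H : t \vdash \eta_2}
\]

Fig. 2. The rules for tableau construction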

A tableau is a finite tree built from instances of these rules, starting with
$t_0 \vdash \varepsilon$, where $t_0$ is the main term of the macro process to be studied. The
intuition about the tableau system is the following. To show that a process has
no fair path, we follow all of its paths. Part of the rules formalize the stepwise
computation rules for state generation. So, when a declared process variable is
encountered, we expand it according to its definition (in the rule pure declared
call). To avoid having to consider infinitely many terms, we do not substitute
the actual parameters for the formals. Instead, we “guess” the specifications of
the arguments we will need and add them as hypotheses. The hypotheses are
applied using the rule formal call, and they are verified in separate branches of
the tableau starting below the instance of the pure declared call rule.

It is very important (for completeness) that there can be only finitely many 
relevant argument properties. The reason for this is that there are only finitely 
many nonequivalent specifications on each type level. As a consequence, we can 
expect sequents to repeat which will allow us to produce a finite tableau for those 
processes having an empty set of fair paths. The top sequents of the instances
of the rule pure declared call will be the ones we recur to. Which tableaux count
as proofs is captured in the definition of a successful tableau.

A path in a tableau is unfair if it contains neither a node of the form
$H : A \cdot t \vdash \varepsilon$ with a fair labeling $A$ nor a node $H : t_\tau \vdash (f)\,\varphi_\tau$. A leaf of the form
$P_\tau \vdash \varphi_\tau$ is recursive if there is a predecessor with the same sequent and the
connecting path is an unfair one. A leaf is successful if it is recursive, or of the
form $H : t \vdash \emptyset$ or $H : D \vdash \varepsilon$.

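Definition 1. A tableau is successful if all its leaves are successful.

Plus
\[
\frac{t_1 + t_2 \vdash \varepsilon}{t_1 \vdash \varepsilon, \quad t_2 \vdash \varepsilon}
\]

Process call
\[
\frac{P \cdot t_1 \cdot \ldots \cdot t_n \vdash \varepsilon}{t_P[t_1/F_1, \dots, t_n/F_n] \vdash \varepsilon}
\qquad P \cdot F_1 \cdot \ldots \cdot F_n = t_P \text{ is the declaration of } P
\]

Constant call
\[
\frac{A \cdot t \vdash \varepsilon}{t \vdash \varepsilon}
\]

Fig. 3. The rules for the construction of proof trees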



Making recursive leaves successful relies on the reasoning that we could go on
extending the tableau in always the same way without ever encountering a fair
state. Thus, the path we follow in the Kripke structure (if any; there might not
be any state term at all on the path in the tableau) is not a fair one. Exactly
this argument is used in the soundness proof below.

Before we come to that point, a few remarks on using a tableau system are in
order. Proving the absence of a fair path in a macro process could be formulated
as an iterative algorithm, and indeed one such algorithm can be derived from
the completeness proof. We chose to use a tableau system because other decision
procedures, for instance the one for fair CTL, can be formulated more easily that
way. And in the same way as the emptiness system generalizes to the one for
fair CTL, so do the proofs of its soundness and completeness.

4 Soundness and Completeness 

The general idea of the soundness proof is to show that any successful tableau 
represents a potentially infinite successful proof tree. Completeness follows from 
the existence of a successful proof tree and that each such tree can be folded 
into a successful (finite) tableau. 

Different from tableaux, proof trees do not contain higher-order assertions.
They compute on the level of Kripke terms, exploring the structure by applying
the functions dist and next. There is in fact just one proof tree for emptiness.
This does not hold for other problems where the general proof technique is also
applicable. The proof tree for emptiness is built using the rules from Fig. 3,
starting with the same sequent as the tableau. To be successful, the only
permitted leaf is $D \vdash \varepsilon$, and the tree must not contain a path with infinitely many
fair states.

Lemma 1. The proof tree of a macro process is successful iff its set of fair paths
is empty.

Soundness of the tableau system follows from: 

Proposition 1. Every successful tableau unfolds into a successful proof tree. 

Proof. (Sketch) To unfold a tableau, we follow its paths, building the proof tree
by applying the corresponding proof tree rule as we proceed. When we encounter
a process call, we remember the argument tableaux in an auxiliary environment,
to be used when a formal parameter is called. At a recursive leaf, we return to
the node it recurs to, usually with an updated environment. That way, a proof
tree is generated.

Infinite paths can only be generated by returning from recursive leaves to 
their predecessors. But the success restriction on recursive leaves guarantees 
that on the expanded path in the proof tree, no fair state will be encountered 
between an occurrence of the predecessor and the recursive leaf. 

Completeness is a bit more complicated.

Proposition 2. Every successful proof tree can be folded into a successful tableau.

Proof. First, we build a possibly infinite tableau corresponding to the proof tree.
To do that, we have to extract higher-order assertions and argument assumptions
when we encounter a process call. If $P \cdot t_1 \cdot \ldots \cdot t_n \vdash \varepsilon$ is a node in the proof
tree, we take, for each $t_i$, the set of paths in the tree which lead to a sequent
$t_i \cdot \ldots \vdash \varepsilon$. Each such node gives a higher-order assertion about $t_i$ (of lower type
level), and the state terms on the connecting path determine whether the fairness
guard is added or not. We collect these argument specifications into a list $\eta_i$.
Remember that there are only finitely many specifications on each type level, so
this is always a finite list. We thus get higher-order assertions to be introduced
by the rule declared call with hypotheses or parameters. The part of the tableau
dealing with the process body is generated from the proof tree starting at that
node; the argument tableaux are extracted from the respective subtrees.

So we get a possibly infinite tableau, and have to convince ourselves that it contains
a successful initial subtree, i.e. that we can cut each infinite path at a successful
recurrence. Since there are only finitely many fair states in the proof tree, outside
of an initial segment, no fair state occurs. Thus, any repeating higher-order
assertion about a declared process variable outside of that initial tree will satisfy
the criterion of a successful recursion. And repetitions are bound to occur on
each path since there are only finitely many different sequents in the (infinite)
tableau.

Full proofs of such (un)folding arguments for related tableau systems are given
in [Hun98]. Combining the two propositions, we get our main result.

Theorem 1. The tableau system is sound and complete. 

Thus, we can construct a finite proof if the set of fair paths of a macro process is
empty. To show that the set is not empty, one can enumerate sufficiently many
tableaux to show that no successful one exists. It suffices to check all tableaux
of a certain maximal depth, for the depth can be bounded by observing that all
terms occurring in assertions are subterms of the process declaration and the
number of (higher-order) specifications on each type level is finite. This bound
is n-exponential for type level n.

An alternative proof could use a dual tableau system establishing the
existence of fair paths. In that system, only one branch of a plus is examined, and its
fairness guard requires rather than permits encountering a fair state before using
a guarded assumption. Recursions then are successful if a fair state is guaranteed
to have been encountered on the path to the recursive leaf.

Remembering that a Büchi automaton is a finite Kripke structure with one
fairness constraint, and that macro processes are closed under parallel
composition with finite structures (Observation 1), we can conclude the decidability of
$\omega$-regular properties for macro processes. These are definable by Büchi automata
[Tho90]. We build the product of the complemented Büchi automaton and our
macro process and apply the tableaux-based decision method.
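
For the special case of a finite structure (a regular process, type level 0), the emptiness question has a simple graph formulation: a fair path exists exactly when some fair state is reachable from an initial state and lies on a cycle. The sketch below covers only this finite case, with a hypothetical adjacency-dictionary encoding; it is not the tableau method, which is what handles the higher type levels.

```python
def reachable(sources, succ):
    """All states reachable from `sources` in zero or more steps."""
    seen, todo = set(sources), list(sources)
    while todo:
        s = todo.pop()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen

def has_fair_path(init, succ, fair):
    """True iff some infinite path from `init` visits `fair` infinitely often."""
    for s in reachable(init, succ) & set(fair):
        # s lies on a cycle iff it is reachable from one of its own successors
        if s in reachable(succ.get(s, ()), succ):
            return True
    return False

# Hypothetical product structure: state 0 is initial, states 1 and 2 form a cycle.
succ = {0: [1], 1: [2], 2: [1]}
```

With the fairness set `{2}` the structure has a fair path, so the property encoded by the complemented automaton would be violated; with the fairness set `{0}` it has none, because state 0 is visited only once.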

Corollary 1. S1S is decidable for the set of traces of a macro process.



5 Conclusion 

We have shown that S1S is decidable for macro processes. A context-free graph
from [MS85] is (modulo the duality between Kripke structures and edge-labeled
graphs) bisimilar to a macro process of level one, while higher type levels
generate still more structures, and we do not require a bounded degree of vertices.
Thus we have generalized the decidability result from [MS85] to a richer set of
graphs, but only for the logic S1S. We conjecture, however, that our proof
technique will enable us to derive decidability of an adequate version of MSOL over
macro processes, which would result in a true generalization. All that would be
needed is an automaton characterization of MSOL over unordered trees,
supposedly something like Rabin automata, and an appropriate generalization of the
fairness mechanism to the richer form of constraints. Also, we would expect that
one could develop a similar tableau system for the full mu-calculus (which is a
sublogic of MSOL). We have already done an extension to a system dealing with
fair CTL. Tableau systems such as ours can of course be turned into iterative, either
local or global, model-checking procedures. Summarizing, the proof techniques
we have applied here seem to be able to cover all of the existing results, with
the exception of [BEH95].

[MS85] contains the conjecture that for a graph to have a decidable monadic
second-order theory, it must result from a context-free graph $\Gamma$ by extending $\Gamma$,
at certain computable points, by one of a finite set of context-free graphs $\Delta$.
This characterizes a generation process which is of second order, and, bearing in
mind the strictness of the hierarchy induced by the type level, the higher-order
result from this paper invalidates this conjecture.

On the other hand, we would not expect to be able to characterize the set of 
graphs with a decidable monadic second-order theory, nor the set of properties 
decidable for macro processes. We observe that already in [BEH95] a differ- 
ent extension of the set of processes has been handled (unsynchronized parallel 
composition), and also nonregular properties were verified, though both the logic
and the process language are difficult to relate to other setups. Furthermore, the
patterns of macro process definitions stay in the domain of primitive recursion,
and we would not suppose decidability to be restricted to that. Stating this
positively, we expect to cover the range of regular properties in finitely-typed
recursive systems, and we do believe there is still more to achieve.

Besides the study of reactive systems, a field of applications of the method
could be found in the domain of data-flow analysis of recursive programs.

Abstract. We consider the multiparty communication model defined
in [DF89] using the formalism from [Hr97]. First, we correct an inaccuracy
in the proof of the fundamental result of [DR95] providing a lower
bound on the nondeterministic communication complexity of a function.
Then we construct several very hard functions, i.e., functions such that
those as well as their complements have the worst possible nondeterministic
communication complexity. The problem of finding a particular very
hard function was proposed in [Du99], where it was shown that almost
all functions are very hard. We also prove that combining two very
hard functions by the boolean operation xor gives a very hard function.



Introduction 

The multiparty model is a natural extension of the two-party model that has been
extensively studied. The two-party communication model assumes that each of
two parties has a part of the input, and the aim is to compute a given boolean
function on the input with the minimal amount of communication, i.e., minimizing
the communication complexity. The study of the communication complexity
of the two-party model was inspired by the complexity of VLSI circuits. There are
many other applications of communication complexity. An overview of
applications can be found in [KN97], and an overview of results on the two-party
model in [Hr97].

In the multiparty model the input is distributed among n processors and the
goal is the same: to compute a function on the whole input with the minimal
total amount of communication. It is assumed that there is a coordinator that
is allowed to communicate with each party, but the parties are not allowed to
communicate directly with each other. The multiparty model was introduced
and investigated in [DF89].

Note that in [CFL83] a different multiparty model was considered. In that 
model each of the n parties has all the inputs except one, and all parties commu- 
nicate through a shared “blackboard”. This model was investigated in [BNS89], 
where an interesting relation to time-space tradeoffs and branching programs 
was discovered. We do not know of any connections between these two models.

* Supported by Academy of Finland under Grant No. 44087. 
There are only a few results known about the multiparty model introduced
in [DF89]. In [DF92] an upper bound on the deterministic communication
complexity of order O(k(f) k^(1 - f) ncc(f) ncc(1 - f)) was established, where k(f)
is the number of processors accessed and ncc(f) is the nondeterministic
communication complexity of the function f. In [DR95] a fundamental tool to prove a
lower bound on the communication complexity of a given boolean function was
developed. However, there was a small mistake in the proof. In this paper we give
a correct proof of this result. Further results proved in [DR95] are the following
relations between the nondeterministic communication complexities of a
function and its complement and between the deterministic and nondeterministic
communication complexities:

$$\mathrm{ncc}(1 - f) \le n\,(1 + 2^{\mathrm{ncc}(f)}), \qquad \mathrm{cc}(f) \le n\,(1 + 2^{\mathrm{ncc}(f)}).$$

For the restricted model, where only one communication is allowed for each 
direction on each link, the following results were established in [DR95]: 

$$\mathrm{ncc}_1(f) = \mathrm{ncc}(f), \qquad \mathrm{cc}_1(f) \le \mathrm{cc}(f) \cdot \text{[illegible]}.$$

For all these bounds it is shown that they are optimal. 

The main result of this paper is the construction of several very hard
functions. A very hard function is a function such that it and its complement have
the worst nondeterministic communication complexity. In the deterministic case,
of course, the communication complexities of a function $f$ and its complement
$1 - f$ are equal. In the nondeterministic case there could be even an exponential
difference between the communication complexities of $f$ and $1 - f$, cf. [DR95].
However, in some sense, functions $f$ and $1 - f$ represent the same function, so
it is better to switch to the complementary function if that dramatically lowers
the nondeterministic communication complexity. The existence of a very hard
function shows that we cannot always use this approach to decrease the
nondeterministic communication complexity. The problem of finding a particular very
hard function was proposed in [Du99], where it was shown that almost all
functions are very hard.

Moreover, we prove that combining two very hard functions by the boolean
operation xor also gives a very hard function. This result extends a similar
result from [DR95], which claims that the boolean operation and preserves
hard functions, and the result from [Du99], which claims that the deterministic
communication complexity adds up when combining two functions by xor.

The full version of this paper appears in [Ma99]. 

1 Preliminaries 

In this section we define the multiparty model, the nondeterministic protocols 
and the communication complexity. We start with an informal definition of the 
model. 

The multiparty model consists of a coordinator and $n$ parties. The coordinator
wishes to evaluate a boolean function $f(x_1, \ldots, x_n)$. The input vector







Jan Manuch 



$x = (x_1, \ldots, x_n)$ is distributed among the parties, with $x_i \in \{0,1\}^m$ known to
party $i$. We allow communication only between the coordinator and any party.
Instead of saying “the communication between the coordinator and party $i$” we
often say “the communication on link $i$”. The computation consists of several
phases, where one phase is as follows:

The coordinator sends some nonempty messages to some parties and then each
party that received a message sends a nonempty message back to the coordinator.
After the computation the coordinator announces the result: 1 if accepted or 0
if rejected. In the nondeterministic case we accept an input if there exists an
accepting computation for this input.

Next, we give the formal definition of a nondeterministic protocol. We will use
the notation from [Hr97]. Let $\lambda$ be the empty string.

Definition 1. A nondeterministic protocol over $X = \{x_1, \ldots, x_n\}$ with
$x_i = x_i^1 \ldots x_i^m \in \{0,1\}^m$, $n \ge 1$, $m \ge 2$, is a pair $P = (\Phi_C, \Phi_P)$, where

(a) $\Phi_C$ is a communication relation of the coordinator in

$$[\{0,1,\$\}^*]^n \times \bigl([\{0,1\}^*]^n \cup \{\bar 0, \bar 1\}\bigr);$$

and the projections of $\Phi_C$ are defined as follows: let $c \in [\{0,1,\$\}^*]^n$, then

$$\Phi_C(c) \cap [\{0,1\}^*]^n = (\Phi_{C,1}(c), \ldots, \Phi_{C,n}(c));$$

(b) $\Phi_P$ is a communication relation of parties in

$$(\{1, \ldots, n\} \times \{0,1\}^m \times \{0,1,\$\}^+) \times \{0,1\}^+,$$

where

(i) $\Phi_C$ has the prefix freeness property:

for each $i \in \{1, \ldots, n\}$, $c, c' \in [\{0,1,\$\}^*]^n$, if $c_i = c'_i$, where $c_i$ is the $i$-th
component of $c$, and $d, d' \in \{0,1\}^+$ are such that $(c, d), (c', d') \in \Phi_{C,i}$, then $d$ is
not a proper prefix of $d'$; if $c_i$ is a proper prefix of $c'_i$ and $d \in \Phi_{C,i}(c) \cap \{0,1\}^+$,
then $c_i d$ is not a proper prefix of $c'_i$ (each party knows when it is the end of
a message from the coordinator),

(ii) $\Phi_P$ has the prefix freeness property:

for each $i \in \{1, \ldots, n\}$, $c \in \{0,1,\$\}^*$, $d, d' \in \{0,1\}^+$ and $x_i, x'_i \in \{0,1\}^m$
such that $((i, x_i, c), d), ((i, x'_i, c), d') \in \Phi_P$, $d$ is not a proper prefix of $d'$ (the
coordinator knows when it is the end of a message from any party),

(iii) for each $i \in \{1, \ldots, n\}$, $c, c' \in [\{0,1,\$\}^*]^n$ such that $c_i = c'_i \ne \lambda$, if
$\Phi_C(c) \cap \{\bar 0, \bar 1\} \ne \emptyset$, then $\Phi_C(c') \subseteq \{\bar 0, \bar 1\}$ or $\Phi_{C,i}(c') = \{\lambda\}$ (if there is a
communication on link $i$ then party $i$ knows when it is finished).

Let us give an intuitive explanation of the symbols used in the above definition.
$0, 1$ represent bits sent through the communication links and $\$$ is a virtual
end mark of messages. Virtual means that the symbol is sent neither by the
coordinator nor by a party. The properties (i)-(iii) ensure that a virtual end mark
is not necessary. In fact, the use of such a mark during the communication can
lead to very nonintuitive properties of the model: consider, for example, a node
which sends $k$ empty messages; in this way the receiver obtains an arbitrarily
large amount of information, the number $k$, with a minimal increase of the communication
complexity. Symbols $\bar 0, \bar 1$ represent the result value of a computation.

The relation $\Phi_C$ maps a temporary state of a communication on all links
to one of the result values or to an $n$-tuple of messages sent from the
coordinator to each party. The empty message means that the coordinator
does not communicate with the party. The relation $\Phi_P$ maps the number $i$ of a
party, the input of the party $i$ and a temporary state of the communication on
the link $i$ to a nonempty message sent from the party $i$ back to the coordinator.
Since we consider nondeterministic protocols, these maps could be ambiguous.
We proceed with the definition of the computation.

Definition 2. A computation of $P$ on an input vector $x = (x_1, \ldots, x_n)$ is a
communication vector $c = (c_1, \ldots, c_n)$ with
$c_i = c_i^1 \$ c_i^2 \$ \ldots \$ c_i^{2r_i - 1} \$ c_i^{2r_i} \$$, where

(i) for all $i \in \{1, \ldots, n\}$: $c_i^1, \ldots, c_i^{2r_i} \in \{0,1\}^+$;

(ii) we can find an $r$ and a sequence of vectors $c_{[0]}, \ldots, c_{[2r]} \in [\{0,1,\$\}^*]^n$, i.e.,
states of the computation, such that

(a) $c_{[0]} = (\lambda, \ldots, \lambda)$,

(b) if $l$ is even, then

$$c_{[l+1]} \in \{(c_{[l]1} d_1 \$, \ldots, c_{[l]n} d_n \$) \mid (d_1, \ldots, d_n) \in \Phi_C(c_{[l]})\},$$

(c) if $l$ is odd, then for each $i \in \{1, \ldots, n\}$

$$c_{[l+1]i} = \lambda, \quad \text{if } c_{[l]i} = \$,$$
$$c_{[l+1]i} = c'\$, \quad \text{if } c_{[l]i} = c'\$\$,$$
$$c_{[l+1]i} \in \{c_{[l]i}\, d\, \$ \mid d \in \Phi_P(i, x_i, c_{[l]i})\}, \quad \text{otherwise},$$

(d) $c_{[2r]} = c$;

(iii) $\Phi_C(c) \subset \{\bar 0, \bar 1\}$.

We denote the set of all computations on an input $x$ (resp. all computations)
by $\mathrm{comp}(P, x)$ (resp. $\mathrm{comp}(P)$). We say that a computation $c$ is accepting if
$\bar 1 \in \Phi_C(c)$. The smallest $r$ is called the number of rounds of $c$. $P$ is called an
$r$-round nondeterministic protocol if every computation of $P$ has at most $r$
rounds.

We say that $P$ computes 1 for an input vector $x$, i.e., $P(x) = 1$, if there
exists an accepting computation $c$ of $P$ on $x$; otherwise $P$ computes 0. We say
that $P$ computes the boolean function $f$ with the input variables $X$ if for each
$x \in [\{0,1\}^m]^n$ we have $f(x) = P(x)$.

Now, we illustrate the above definitions with the following example. 
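Example 1. Consider the function $f$ defined by

$$f(x_1, \ldots, x_n) = 1 \text{ iff there exist } i \ne j \text{ such that } x_i = x_j.$$

We construct a nondeterministic protocol $P = (\Phi_C, \Phi_P)$ computing this function:

$$\Phi_C(\lambda, \ldots, \lambda) = \{(b_1, \ldots, b_n) \mid \exists\, i \ne j:\ b_i = b_j \in \{0,1\} \text{ and } \forall k \ne i,j:\ b_k = \lambda\},$$

$$\Phi_P(i, x_i, b\$) = \begin{cases} x_i^2 \ldots x_i^m, & \text{if } x_i^1 = b, \\ \text{[illegible]}, & \text{otherwise,} \end{cases}$$

$$\Phi_C(c_1, \ldots, c_n) = \begin{cases} \bar 1, & \text{if } c_i = c_j = b\$y\$,\ b \in \{0,1\},\ y \in \{0,1\}^{m-1} \text{ and } \forall k \ne i,j:\ c_k = \lambda, \\ \bar 0, & \text{otherwise.} \end{cases}$$
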



$\Phi_C$ has the prefix freeness property, since all messages the coordinator sends
have the same length. The same holds for $\Phi_P$. If we have $c, c' \in [\{0,1,\$\}^*]^n$
with $c_i = c'_i \ne \lambda$ and $\Phi_C(c) \cap \{\bar 0, \bar 1\} \ne \emptyset$, then $\Phi_C(c') \subset \{\bar 0, \bar 1\}$, so $P$ also satisfies
condition (iii) of Definition 1. Hence it is a nondeterministic protocol.

Informally we can describe the protocol $P$ as follows: The coordinator guesses
for which $i \ne j$ the equality $x_i = x_j$ holds. It also guesses the first bit of $x_i = x_j$
and sends it to the parties $i$ and $j$. It does not communicate with the other
parties at all. The party $i$ (resp. $j$) checks if the coordinator guessed the first
bit of $x_i$ (resp. $x_j$) correctly and in such a case sends back the rest of the input $x_i$
(resp. $x_j$). Finally, the coordinator checks if the rest of $x_i$ equals the rest of
$x_j$. Only in such a case does it end with $\bar 1$. Clearly, this protocol computes the function
$f$.
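
The informal description can be tried out on small inputs. The following is only a sketch, not the formal protocol: the names `accepts` and `f` are hypothetical, inputs are Python bit strings, and a party whose first bit does not match the guess is modelled by the placeholder `None` (an assumption, since the text only describes the matching case).

```python
from itertools import combinations

def accepts(x):
    """True if some guess (i, j, b) of the coordinator leads to acceptance."""
    n = len(x)
    for i, j in combinations(range(n), 2):    # guessed pair of parties
        for b in "01":                         # guessed common first bit
            # A party sends the rest of its input only if the guess matches
            # its first bit; None models a mismatch (assumption of this sketch).
            reply_i = x[i][1:] if x[i][0] == b else None
            reply_j = x[j][1:] if x[j][0] == b else None
            if reply_i is not None and reply_i == reply_j:
                return True
    return False

def f(x):
    """Reference definition: true iff two distinct parties hold equal inputs."""
    return len(set(x)) < len(x)
```

Comparing `accepts` with `f` over all inputs for small m and n is a quick sanity check that guessing a pair and a first bit is enough.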

Consider an input vector $x = (x_1, \ldots, x_n)$ such that $x_1 = x_2 = a_1 \ldots a_m \ne x_3$
and $a_1 = x_3^1$. Then $c = (a_1\$a_2 \ldots a_m\$, a_1\$a_2 \ldots a_m\$, \lambda, \ldots, \lambda)$ is an accepting
computation of $P$ on $x$. Indeed, we have $r_1 = r_2 = 1$, $r_3 = \cdots = r_n = 0$ and the
sequence of states of the computation is:

$$c_{[0]} = (\lambda, \ldots, \lambda), \qquad c_{[1]} = (a_1\$, a_1\$, \$, \ldots, \$),$$

$$c_{[2]} = (a_1\$a_2 \ldots a_m\$, a_1\$a_2 \ldots a_m\$, \lambda, \ldots, \lambda) = c.$$

The number of rounds of $c$ is 1. The computation
$c' = (a_1\$a_2 \ldots a_m\$, \lambda, a_1\$x_3^2 \ldots x_3^m\$, \lambda, \ldots, \lambda)$ is not an accepting computation. Obviously, each computation
uses exactly 1 round, so $P$ is a 1-round protocol.

We finish this section by the definition of the nondeterministic communica- 
tion complexity. 

Definition 3. The length of a computation $c$ of $P$ is the total length of all
messages $m \in \{0,1\}^+$ in $c$. For each $x \in [\{0,1\}^m]^n$ such that $P(x) = 1$, let
$\mathrm{ncc}(P, x)$ denote the length of the shortest accepting computation of $P$ on $x$. The
nondeterministic communication complexities of the protocol $P$ and of
the function $f$ are

$$\mathrm{ncc}(P) = \max\{\mathrm{ncc}(P, x) \mid P(x) = 1\}, \qquad \mathrm{ncc}(f) = \min\{\mathrm{ncc}(P) \mid P \text{ computes } f\}.$$
Similarly, we can define $\mathrm{ncc}(P, x/J)$, $\mathrm{ncc}(P/J)$ and $\mathrm{ncc}(f/J)$, when we measure
only the communication on links in $J$.

Example 1 (continued). The length of both computations $c$ and $c'$ is $2m$. Clearly
this holds for any accepting computation of $P$. Hence $\mathrm{ncc}(P) = 2m$, which
implies $\mathrm{ncc}(f) \le 2m$.

Using the idea of how the protocol $P$ works, we can construct for any boolean
function $f$ a protocol with nondeterministic communication complexity $nm$
which first sends all the information the parties have to the coordinator, after
which the coordinator is able to compute $f$. This implies:

Proposition 1. For any boolean function $f : [\{0,1\}^m]^n \to \{0,1\}$, $\mathrm{ncc}(f) \le nm$.

Now we can define hard and very hard functions. 

Definition 4. A boolean function $f$ is hard if $\mathrm{ncc}(f) = nm$, and it is very hard
if $\mathrm{ncc}(f) = \mathrm{ncc}(1 - f) = nm$.



2 The Fundamental Tool 

In this section we correct Theorem 1 and its proof from [DR95]. The mistake 
in [DR95] occurs actually in Lemma 1. Here is the correct version of Lemma 1. 

Lemma 1. Let $b_1, b_2, \ldots, b_p$ be a sequence of nonempty binary strings such that
no element is a proper prefix of another and no element of this sequence occurs
in it more than $q$ times. Then $\sum_{i=1}^{p} |b_i| \ge p \log \frac{p}{q}$.

Before the proof let us show that Lemma 1 in [DR95] does not hold. 

Example 2. In Lemma 1 and its proof in [DR95] the inequalities
$\sum_{i=1}^{p} |b_i| \ge p \log \lceil \frac{p}{q} \rceil$ and $\sum_{i=1}^{p} |b_i| \ge p \lceil \log \frac{p}{q} \rceil$ are claimed, and the latter one is used in
the proof of Theorem 1 in [DR95]. However, the following sequence of binary
strings 00, 00, 00, 01, 1, 1, 1 shows that neither of these is true. Indeed, we have
$\sum_{i=1}^{p} |b_i| = 11$, $p = 7$ and if we choose $q = 3$, then $p \log \lceil \frac{p}{q} \rceil = 11.09$ and
$p \lceil \log \frac{p}{q} \rceil = 14$.
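
The numbers in this counterexample are easy to recompute. The snippet below is a small sketch with hypothetical variable names; logarithms are taken to base 2, which is the reading consistent with the value 11.09 quoted above.

```python
import math

seq = ["00", "00", "00", "01", "1", "1", "1"]
p = len(seq)
q = max(seq.count(s) for s in set(seq))       # largest multiplicity, here 3
total = sum(len(s) for s in seq)               # sum of the string lengths

corrected = p * math.log2(p / q)               # bound of the corrected Lemma 1
claim_a = p * math.log2(math.ceil(p / q))      # first bound attributed to [DR95]
claim_b = p * math.ceil(math.log2(p / q))      # second bound attributed to [DR95]

print(total, round(corrected, 2), round(claim_a, 2), claim_b)
```

The total length lies above the corrected bound but below both bounds with the ceiling, which is exactly the point of the example.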

Proof (of Lemma 1). Let $q' \le q$ be the maximal number of occurrences
of a string in the sequence. For $i = 1, \ldots, q'$ consider the sets $Q_i$ containing the strings
which occur in the sequence at least $i$ times. Clearly, $\sum_{i=1}^{q'} |Q_i| = p$. For each
set $Q_i$ there is a corresponding binary tree such that the strings in $Q_i$ encode
the paths of the tree. Since no element in the sequence is a proper prefix of
another element, all strings end in the leaves of the tree. The trees $Q_1, \ldots, Q_{q'}$
together form the forest $G$. Let $T(G)$ denote the total sum of the depths of all
leaves of the forest $G$. Obviously, $T(G) = \sum_{i=1}^{p} |b_i|$. Now we transform the forest
$G$ into $G'$ in two steps:

Step (i). If some node in $G$ has only one son, then we delete the node and replace
it by its son. In this way we get a binary tree.

Step (ii). Let $v_1$ (resp. $v_2$) be a leaf with the maximal (resp. minimal) depth. If
$\mathrm{depth}(v_1) \le \mathrm{depth}(v_2) + 1$, then the forest is balanced. Otherwise, we perform
the following operation: if $v$ is the father of $v_1$, then we cut off both sons of $v$
and connect them to $v_2$. In this way we get the balanced forest $G'$ with depths of
leaves either $h$ or $h + 1$.

Note that in both steps we preserve the numbers of both leaves and trees,
and the depths of leaves can only decrease, thus $T(G') \le T(G)$. Let $a$ (resp. $b$)
be the number of leaves in depth $h$ (resp. $h + 1$). Clearly, $a + b = p$. It is easy to
derive $q' \cdot 2^{h+1} = 2a + b = p + a$, which implies

$$T(G) \ge T(G') = ha + (h + 1)b = p \log \frac{p + a}{q'} - a \ge p \log \frac{p}{q'},$$

where the last inequality comes from the inequality $\log(x + 1) \ge x$ for $x \in (0, 1)$,
in which we substitute $x = \frac{a}{p}$. Since $q' \le q$, this gives the claimed bound $p \log \frac{p}{q}$. □
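
As an informal cross-check of the corrected lemma (not a substitute for the proof), one can test the bound on random sequences. This is a sketch under stated assumptions: the helper names are hypothetical, logarithms are base 2, and prefix-free string sets are generated as the leaves of a random binary tree.

```python
import math
import random

def random_prefix_free_code(rng, max_leaves):
    """Leaves of a random binary tree: no string is a prefix of another."""
    leaves = [""]
    while len(leaves) < max_leaves:
        s = leaves.pop(rng.randrange(len(leaves)))
        leaves += [s + "0", s + "1"]           # split one leaf into two sons
    return leaves

def random_instance(rng):
    """A sequence in which every string occurs between 1 and q times."""
    code = random_prefix_free_code(rng, rng.randint(2, 8))
    q = rng.randint(1, 4)
    seq = [s for s in code for _ in range(rng.randint(1, q))]
    return seq, q

def lemma1_holds(seq, q):
    """Check sum |b_i| >= p log(p/q), with a small floating-point tolerance."""
    p = len(seq)
    return sum(len(s) for s in seq) >= p * math.log2(p / q) - 1e-9
```

Calling `lemma1_holds(*random_instance(random.Random(0)))` repeatedly with fresh instances exercises the inequality on many small cases.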

To state the theorem we need another two definitions. 



Definition 5. [DR95] Let $Y$ be a nonempty subset of $f^{-1}(1)$. We say that the
index $j$ is important for $f$ with respect to $Y$, if for every $y = (y_1, \ldots, y_n) \in Y$
there is $y'_j \in \{0,1\}^m$ such that $f(y_1, \ldots, y_{j-1}, y'_j, y_{j+1}, \ldots, y_n) \ne f(y)$.

Let $J \subset \{1, \ldots, n\}$ be a nonempty set of indices and $x = (x_1, \ldots, x_n)$,
$y = (y_1, \ldots, y_n)$ two input vectors. We denote

$$[x:y]/J = (z_1, \ldots, z_n), \quad \text{where } z_i = \begin{cases} x_i, & \text{if } i \in J, \\ y_i, & \text{otherwise.} \end{cases}$$

Let $M$ be a nonempty set of inputs. Then the $J$-closure of $M$ is the set
$C\ell_J(M) = \{[x:y]/J \mid x, y \in M\}$.
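
The mixing operation and the J-closure translate directly into code. The sketch below uses hypothetical names (`mix`, `closure`), represents input vectors as tuples, and uses 1-based indices in `J` to match the text.

```python
from itertools import product

def mix(x, y, J):
    """[x:y]/J: component i comes from x if i is in J, otherwise from y."""
    return tuple(xi if i in J else yi
                 for i, (xi, yi) in enumerate(zip(x, y), start=1))

def closure(M, J):
    """J-closure of a set M of input vectors."""
    return {mix(x, y, J) for x, y in product(M, repeat=2)}
```

Since `x` and `y` range over all of `M`, including `x == y`, the closure always contains `M` itself.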

If we use Lemma 1 instead of the incorrect Lemma 1 in [DR95] in the proof of
the theorem, we get the following result.

Theorem 1. Let $Y$ be a nonempty subset of $f^{-1}(1)$ and $J_1, \ldots, J_r$ be pairwise
disjoint sets of indices. Assume that every $j \in \bigcup_{i=1}^{r} J_i$ is an important index for
$f$ with respect to $Y$ and let $d_i$ be integers satisfying

$$d_i \ge \max_{M \subseteq Y} \{|M| \mid C\ell_{J_i}(M) \subset f^{-1}(1)\}.$$

Then we have lower bounds on the nondeterministic communication complexity

$$\mathrm{ncc}(f) \ge \sum_{i=1}^{r} \log \frac{|Y|}{d_i}, \qquad \mathrm{ncc}(f/J_i) \ge \log \frac{|Y|}{d_i}.$$
The proof is based on Lemma 1, properties of nondeterministic protocols
and the classical crossing sequence argument, and we refer interested readers
to [DR95]. Note that the above theorem differs from Theorem 1 in [DR95], where
it was claimed that $\mathrm{ncc}(f) \ge \sum_{i=1}^{r} \lceil \log \frac{|Y|}{d_i} \rceil$. We apply this theorem in Example 1.

Notation. For every $a \in \{0,1\}^m$ and every integer $d$ let $\mathrm{val}(a)$ denote the
value of the binary string $a$ and let $(a)_d$ denote the string from $\{0,1\}^m$ such that
$\mathrm{val}((a)_d) = \mathrm{val}(a) + d \bmod 2^m$.

Example 1 (continued). Assume that $n \le 2^{m-1} + 1$. Let us choose in Theorem 1:
$r = 2$, $J_1 = \{1\}$, $J_2 = \{2\}$ and $Y = \{(a, a, (a)_1, (a)_2, \ldots, (a)_{n-2}) \mid a \in \{0,1\}^m\}$.
It is easy to see that indices 1 and 2 are important with respect to
$Y$. Take arbitrary distinct $a, b \in \{0,1\}^m$. Assume that $f(a, b, (b)_1, (b)_2, \ldots) =
f(b, a, (a)_1, (a)_2, \ldots) = 1$. Then we have $a = (b)_c$ and $b = (a)_d$ for some integers
$1 \le c, d \le n - 2 \le 2^{m-1} - 1$. Obviously $c + d = 2^m$, which contradicts the
above restrictions for $c$ and $d$. Hence we can set $d_1 = 1$ and similarly $d_2 = 1$.
Using Theorem 1 we obtain $\mathrm{ncc}(f) \ge 2 \log(2^m) = 2m$. Thus we can conclude
$\mathrm{ncc}(f) = 2m$ for all $n \le 2^{m-1} + 1$.
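
The choice of Y and the claim that one may take d_1 = d_2 = 1 can be checked by brute force for small parameters. The following sketch uses hypothetical helper names, encodes inputs as bit strings, and uses a 0-based index `j` for the component that is swapped; it also shows what goes wrong once n exceeds the stated restriction.

```python
from itertools import combinations

def shift(a, d, m):
    """(a)_d: the m-bit string with value val(a) + d mod 2^m."""
    return format((int(a, 2) + d) % 2**m, f"0{m}b")

def f(x):
    """1 iff two distinct parties hold equal inputs."""
    return int(len(set(x)) < len(x))

def build_Y(m, n):
    """All vectors (a, a, (a)_1, ..., (a)_{n-2}) for a in {0,1}^m."""
    strings = [format(v, f"0{m}b") for v in range(2**m)]
    return [(a, a) + tuple(shift(a, k, m) for k in range(1, n - 1))
            for a in strings]

def d_is_one(m, n, j):
    """True if no two distinct vectors of Y have their {j}-closure in f^-1(1)."""
    for y, z in combinations(build_Y(m, n), 2):
        yz = z[:j] + (y[j],) + z[j + 1:]       # component j from y, rest from z
        zy = y[:j] + (z[j],) + y[j + 1:]       # component j from z, rest from y
        if f(yz) == 1 and f(zy) == 1:
            return False
    return True
```

For m = 3 the restriction allows n up to 5, and `d_is_one(3, 5, 0)` holds, whereas `d_is_one(3, 6, 0)` fails because c = d = 4 gives c + d = 2^m.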



3 Very Hard Functions 

In this section we give the main result of the paper. It can be shown using
Theorem 1 that the function $g(x_1, x_2) = 1$ iff $\mathrm{val}(x_1) \le \mathrm{val}(x_2)$ is a very hard
function. We construct several very hard functions for arbitrary $n$ using the
function $g$. We need some more definitions.

Definition 6. Let $f(x_1, \ldots, x_n)$ be a boolean function, $x_i \in \{0,1\}^m$, and let
$J_i = \{i\}$ for all $i \in \{1, \ldots, n\}$. If there exists a nonempty set $Y \subset f^{-1}(1)$ such
that for all $i \in \{1, \ldots, n\}$ the index $i$ is important with respect to $Y$ and there
exist integers $d_i$ such that for every $i \in \{1, \ldots, n\}$

$$d_i \ge \max_{M \subseteq Y} \{|M| \mid C\ell_{J_i}(M) \subset f^{-1}(1)\}, \quad \text{and} \qquad (1)$$

$$\log \frac{|Y|}{d_i} = m, \qquad (2)$$

then the function $f$ is strongly hard. We will say that the function $f$ is strongly
very hard when both the function $f$ and its complement $1 - f$ are strongly hard.

According to Theorem 1 it is obvious that any strongly (very) hard function
is also a (very) hard function. In what follows we take a strongly very hard
function $f(x_1, x_2)$ and using the operation xor on boolean strings we construct a
function $g(x_1, \ldots, x_n)$. We show that the function $g$ is very hard. But let us first
define the operation xor on boolean strings and the xor constructor $F^{\oplus}$ which
constructs the function $g$.
Definition 7. Let $\oplus$ denote the boolean operation xor. Let $x = x^1 \ldots x^m$,
$y = y^1 \ldots y^m$ be $m$-bit long boolean strings. We define the xor operation on strings
as follows: $x \oplus y = (x^1 \oplus y^1) \ldots (x^m \oplus y^m)$. Let $n \ge 2$ be an integer, $J \subset \{1, \ldots, n\}$
be a nonempty set and $f(x_1, x_2)$ be a boolean function. We define the xor
constructor $F^{\oplus}$ as

$$F^{\oplus}(J, n, f)(x_1, \ldots, x_n) = f\Bigl(\bigoplus_{i \in J} x_i,\ \bigoplus_{i \in \{1, \ldots, n\} - J} x_i\Bigr).$$
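
A small sketch of the xor constructor, assuming that the first argument of f collects the inputs with index in J and the second the remaining ones, and that both index sets are nonempty. The names `xor_strings` and `xor_constructor` are hypothetical; indices are 1-based as in the text.

```python
from functools import reduce

def xor_strings(x, y):
    """Bitwise xor of two bit strings of equal length."""
    return "".join("1" if a != b else "0" for a, b in zip(x, y))

def xor_constructor(J, n, f):
    """Return g with g(x_1..x_n) = f(xor of x_i, i in J; xor of x_i, i not in J)."""
    rest = [i for i in range(1, n + 1) if i not in J]
    def g(*xs):
        left = reduce(xor_strings, (xs[i - 1] for i in sorted(J)))
        right = reduce(xor_strings, (xs[i - 1] for i in rest))
        return f(left, right)
    return g
```

For example, `xor_constructor({1, 3}, 3, f)` applies `f` to `x_1 xor x_3` and `x_2`.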

The proof of the following theorem can be found in [Ma99]. 

Theorem 2. Let $f(x_1, x_2)$ be a strongly very hard function, $n$ be an integer and
$J \subset \{1, \ldots, n\}$ be a nonempty set. Then the function $g = F^{\oplus}(J, n, f)$ is very
hard.

Theorem 2 gives us a method to construct a very hard function for any
$n$. But first we have to find a strongly very hard function $f(x_1, x_2)$. We improve
the function $g$ to be strongly very hard. Consider the function $g'$ defined by

$$g'(x_1, x_2) = 1 \text{ iff } \mathrm{val}(x_1) \le \mathrm{val}(x_2) \wedge (x_1, x_2) \ne (0^m, 1^m) \vee (x_1, x_2) = (1^m, 0^m).$$

Theorem 3. The function g' is strongly very hard. 

Proof. First, consider the function $g'$. Set $J_1 = \{1\}$, $J_2 = \{2\}$, $Y = \{(x_1, x_2) \mid
x_1 = x_2\}$. Take any vector $x = (a, a) \in Y$. We have $g'((a)_{+1}, a) = g'(a, (a)_{-1}) =
0$, hence both indices are important with respect to $Y$. Consider now $a \ne b$.
Clearly, exactly one of the values $g'(a, b)$ and $g'(b, a)$ is equal to 0. Hence the
integers $d_1 = d_2 = 1$ satisfy the condition (1) of Definition 6 and since $|Y| = 2^m$,
they satisfy also (2). So the function $g'$ is strongly hard.

For the function $1 - g'$ we set $J_1 = \{1\}$, $J_2 = \{2\}$, $Y = \{((a)_{+1}, a) \mid
a \in \{0,1\}^m\}$. Since, for all $a \in \{0,1\}^m$, $(1 - g')(a, a) = 0$, both indices 1, 2
are important with respect to $Y$. Take any vectors $((a)_{+1}, a), ((b)_{+1}, b) \in Y$
where $a \ne b$. We want to show that at least one of the values $(1 - g')((a)_{+1}, b)$,
$(1 - g')((b)_{+1}, a)$ is equal to 0. Assume the contrary that both the values are equal
to 1. Then we have $\mathrm{val}((a)_{+1}) > \mathrm{val}(b)$ and either $((a)_{+1}, b) \ne (1^m, 0^m)$ or $((a)_{+1}, b) =
(0^m, 1^m)$. In the second case both $a, b$ are equal to $1^m$, which is a contradiction.

In the first case we obtain $2^m - 1 \ge \mathrm{val}(a) + 1 > \mathrm{val}(b) \ge 0$ and similarly
$2^m - 1 \ge \mathrm{val}(b) + 1 > \mathrm{val}(a) \ge 0$. These together imply the inequality $\mathrm{val}(a) + 1 >
\mathrm{val}(b) > \mathrm{val}(a) - 1$. Thus again $a = b$, a contradiction. We have shown that at
least one of the values $(1 - g')((a)_{+1}, b)$, $(1 - g')((b)_{+1}, a)$ is equal to 0. The cardinality
of the set $Y$ is $2^m$. Therefore as in the previous case the integers $d_1 = d_2 = 1$
satisfy both conditions (1) and (2). Hence the function $1 - g'$ is strongly hard
and the function $g'$ is strongly very hard. □
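
The two halves of this proof can be replayed by exhaustive search for small m. The sketch below is an illustration with hypothetical names; strings are represented by their values 0 .. 2^m - 1, and the check covers importance of both indices together with d_1 = d_2 = 1 for a set Y of size 2^m.

```python
from itertools import combinations

def g_prime(x1, x2, m):
    """g'(x1, x2) on values, following the definition in the text."""
    lo, hi = 0, 2**m - 1
    return int((x1 <= x2 and (x1, x2) != (lo, hi)) or (x1, x2) == (hi, lo))

def strongly_hard(func, Y, m):
    """Check Y inside func^-1(1), |Y| = 2^m, importance, and d_1 = d_2 = 1."""
    vals = range(2**m)
    if len(Y) != 2**m or any(func(a, b) != 1 for a, b in Y):
        return False
    for a, b in Y:                              # each index must flip the value
        if all(func(v, b) == 1 for v in vals) or all(func(a, v) == 1 for v in vals):
            return False
    for (a1, a2), (b1, b2) in combinations(Y, 2):
        # the {1}-closure and the {2}-closure add the same two mixed vectors
        if func(a1, b2) == 1 and func(b1, a2) == 1:
            return False
    return True
```

With m = 3, the diagonal `[(a, a) for a in range(8)]` works for g', and `[((a + 1) % 8, a) for a in range(8)]` works for its complement.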

As an immediate consequence of Theorems 2 and 3 we have 

Corollary 1. Let $n \ge 2$ be an integer and $J$ be a nonempty subset of $\{1, \ldots, n\}$.
The function $F^{\oplus}(J, n, g')$ is very hard.

Next, we analyze the functions created by combining two boolean functions 
by the boolean operation xor. More precisely: 
Definition 8. Let $g_1 : [\{0,1\}^m]^{n_1} \to \{0,1\}$, $g_2 : [\{0,1\}^m]^{n_2} \to \{0,1\}$ be two
boolean functions. Let $n = n_1 + n_2$. We define the boolean function $g_1 \oplus g_2$ as

$$\forall x \in [\{0,1\}^m]^n : \ (g_1 \oplus g_2)(x_1, \ldots, x_n) = g_1(x_1, \ldots, x_{n_1}) \oplus g_2(x_{n_1+1}, \ldots, x_n).$$
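
In code, combining two functions by xor amounts to splitting the argument list. This is a minimal sketch with hypothetical names; `leq` is an illustrative comparison of bit strings by value and is an assumption of the example, not part of the definition.

```python
def xor_of(g1, n1, g2):
    """(g1 xor g2)(x_1..x_n) = g1(x_1..x_n1) xor g2(x_{n1+1}..x_n)."""
    def g(*xs):
        return g1(*xs[:n1]) ^ g2(*xs[n1:])
    return g

def leq(x1, x2):
    """1 iff val(x1) <= val(x2), for bit strings x1 and x2."""
    return int(int(x1, 2) <= int(x2, 2))
```

For instance, `xor_of(leq, 2, leq)` takes four strings and xors the two pairwise comparisons.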

Now we can state our last theorem. Its proof can be found in [Ma99].

Theorem 4. For all boolean functions $g_1, g_2$

$$\max(\mathrm{ncc}(g_1) + \mathrm{ncc}(1 - g_2),\ \mathrm{ncc}(1 - g_1) + \mathrm{ncc}(g_2)) \le \mathrm{ncc}(g_1 \oplus g_2).$$

In particular, if both functions $g_1$ and $g_2$ are very hard, then the function $g_1 \oplus g_2$
is also very hard.

As a consequence we have another very hard function.

Corollary 2. For all even $n$ the function
$h(x_1, \ldots, x_n) = (x_1 \le x_2) \oplus \ldots \oplus (x_{n-1} \le x_n)$ is very hard.

Abstract. This is an informal introduction to recent developments in
the theory of distributed computing, showing how notions from
combinatorial and algebraic topology can be used to capture essential aspects
of distributed computing.



1 Introduction 

This paper gives an informal description of some recent developments in the 
theory of distributed computing. These developments came about through the 
realization that techniques borrowed from combinatorial and algebraic topology 
could be used to capture essential aspects of distributed computing. We take a 
historical approach, tracing how these developments emerged from earlier work. 
In a sense, topological notions were present from the start, but they were not 
recognized as such, because early work relied on the most elementary of topo- 
logical properties: connectivity. Only later was it realized that more advanced 
topological notions could be used to attack harder problems.

This paper is not intended to be a comprehensive survey, and it omits many 
important results. Instead, our goal is to provide an intuitively appealing
introduction to the new approach, and to make it more accessible to non-specialists.
We will also describe how the topological approach has motivated new models 
and techniques as well as providing a deeper understanding of certain classical 
results. 

Perhaps the most important goal of the Theory of Distributed Computing is 
to understand which tasks can be solved by a distributed system, and at what 
cost. The answer depends on the task itself, as well as the specific assumptions 
made about the system, such as the type and number of faults, the degree 
of asynchrony, and the communication mechanisms (such as message passing, 
read/write memory, or other shared memory synchronization operations). This 
paper is organized as follows. In the next section, we describe two of the best- 
studied distributed computing models. In Section 3, we give an overview of 
the classical connectivity-based results. These results were the precursors of the 
modern topological approach, which is described in Section 4. 

2 Models of Distributed Computing

A distributed system consists of n + 1 processes that communicate with one 
another to solve a task. In this paper, we consider decision tasks, in which each 
process starts with a private input value, the processes communicate among 
themselves, and then each process chooses an output value and halts. 

The first model we consider is the synchronous message-passing model. In
this model, computation proceeds in a sequence of rounds. In each round, a
process sends messages to the other processes, receives the messages sent to
it by the other processes in that round, and changes state. All processes take
steps at exactly the same rate, and all messages take exactly the same time
to be delivered. Historically, this model was the first to be studied, and many of the
classical results were derived in this context. Although this model is now widely
considered to be unrealistic, it is still worth studying because lower bounds for
this model extend to more realistic models, and the simplicity of the model serves
to illuminate a number of basic principles of distributed computing.

We also consider the asynchronous shared-memory model. In this model, 
there is no bound on the amount of time that can elapse between process steps, 
and processes communicate by reading and writing variables in a shared memory. 
While this model is more realistic than the synchronous message-passing model, 
it can still be criticized as an imperfect reflection of current practice. Modern 
multiprocessors typically provide synchronization operations, such as test-and-set
and load-locked/store-conditional, that are more powerful than simple read
and write operations. Nevertheless, this model also rewards study, as many of the 
modern topological results are best illustrated in this context. We will discuss 
below the effects of appending more powerful (and more realistic) primitives to 
this model. 

The state of a system is described at two levels: a local state for each process,
and a global state consisting of all the local states augmented by an environment
that captures other relevant information, such as the state of the shared memory
or the messages in transit. An event occurs when a process executes
some significant action, such as sending or receiving a message, or reading or
writing a shared variable. An execution is an alternating sequence of states and
events.

A crash failure occurs when a process halts without warning. An important
parameter of a model of computation, typically denoted by $f$, is the maximal
number of processes that can crash. A protocol is a program that solves a task. We
are interested in protocols that tolerate $f$ or fewer crash failures; such protocols
are called $f$-resilient. When $f$ is $n$ out of $n + 1$ processes, the protocol is called
wait-free, because any process can finish without waiting for any other process.

3 Connectivity 

The consensus task is perhaps the best-studied problem in theoretical distributed 
computing (e.g. [17,18,20,27,42]). In the simplest form of this problem, each 
process starts with a private binary input (either 0 or 1). The processes decide
on a binary output satisfying the following conditions: 

— Termination: Every non-faulty process eventually chooses a value, 

— Agreement: All non-faulty processes decide on the same value, and 

— Validity: The value chosen is some process’s input value. 

The validity condition implies that if all initial values are the same, then every 
non-faulty process decides that value. 
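
The three conditions can be phrased as a check on the outcome of a single execution. The sketch below is only an illustration: `check_consensus` is a hypothetical name, inputs and outputs are dictionaries keyed by process, an undecided process has output `None`, and validity is evaluated on the values decided by non-faulty processes (an assumption of this sketch).

```python
def check_consensus(inputs, outputs, faulty=frozenset()):
    """Return (termination, agreement, validity) for one finished execution."""
    correct = [p for p in inputs if p not in faulty]
    # Termination: every non-faulty process has chosen a value.
    termination = all(outputs.get(p) is not None for p in correct)
    decided = {outputs[p] for p in correct if outputs.get(p) is not None}
    # Agreement: the non-faulty processes decided at most one distinct value.
    agreement = len(decided) <= 1
    # Validity: every decided value is the input of some process.
    validity = decided <= set(inputs.values())
    return termination, agreement, validity
```

For example, with inputs `{0: 0, 1: 0}` and outputs `{0: 1, 1: 1}` the execution terminates and agrees but violates validity.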



3.1 Impossibility and Indistinguishability 

The distributed computing literature encompasses a dazzling variety of results 
characterizing the circumstances under which consensus is or is not solvable, and
lower bounds on the complexity of solutions when they exist. At the heart of 
nearly all such impossibility arguments is the notion of the indistinguishability 
of distinct global states to a process. Briefly put, two global states $x$ and $y$ are
indistinguishable to process $p$ if $p$ has the same local state in both, denoted

$$x \sim_p y.$$

For example, consider global states $x$ and $y$ in the synchronous message-passing
model, where two distinct executions lead to those states. If $p$ has the same initial
state in both executions, and receives the same sequence of messages, then $x$ and
$y$ are indistinguishable to $p$ ($x \sim_p y$). A similarity chain is a sequence of states
such that any two consecutive states are indistinguishable to some process.

Similarity chains are central to the classical analysis of asynchronous
consensus. If $x \sim_p y$, where $p$ is a non-faulty process and $x$ and $y$ are final global states
of a consensus protocol, then $p$ must decide the same value in both $x$ and $y$
(because it cannot distinguish between them). The agreement condition on
consensus implies that all non-faulty processes decide the same value in both states.
A simple inductive argument shows that if $x$ and $y$ are related by a similarity
chain, then all non-faulty processes must decide the same value in both states.
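
Indistinguishability and similarity chains have a direct set-based reading. In the sketch below a global state is modelled as a dictionary from process to local state, which is an assumption made for illustration, and the function names are hypothetical; the chain test asks for a non-faulty witness at every link, as the inductive argument requires.

```python
def indistinguishable_to(x, y):
    """Processes that have the same local state in global states x and y."""
    return {p for p in x if p in y and x[p] == y[p]}

def is_similarity_chain(states, nonfaulty):
    """Every consecutive pair must look identical to some non-faulty process."""
    return all(indistinguishable_to(a, b) & nonfaulty
               for a, b in zip(states, states[1:]))
```

Walking along such a chain, each link forces one non-faulty process, and by agreement all of them, to decide the same value at both ends.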

Fischer, Lynch and Paterson [20] proved in 1985 that there is no consensus
protocol in an asynchronous message-passing system where even one process can
fail (i.e., $f = 1$). We follow established custom by referring to this result as the
FLP proof. (Later, Dwork et al. [14] and Loui and Abu-Amara [37] extended
this result to asynchronous read/write memory.)

The fundamental idea underlying the FLP proof is the following. The agreement
condition ensures that if any process decides a value v, they all do, so
we can speak of an execution deciding value v. Assume by way of contradiction
that we have a consensus protocol. A global state is 0-valent if every execution
that passes through that state decides 0, and similarly for 1. A global state is
univalent if every execution passing through it decides the same value, and it is
bivalent otherwise. Clearly, no protocol can terminate in a bivalent state.
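
As an illustration of these definitions, the following sketch classifies the states of a tiny, entirely hypothetical execution tree. The dictionaries successors and decision are invented for the example; a real protocol would generate them from its transition rules.

```python
from typing import Dict, List, Set

# Hypothetical toy execution tree: each non-final state maps to its
# successors, and each final state maps to the value decided there.
successors: Dict[str, List[str]] = {
    "s": ["a", "b"],
    "a": ["a0", "a1"],
    "b": ["b1"],
}
decision: Dict[str, int] = {"a0": 0, "a1": 1, "b1": 1}


def reachable_decisions(state: str) -> Set[int]:
    """Values decided by some execution that passes through `state`."""
    if state in decision:
        return {decision[state]}
    values: Set[int] = set()
    for nxt in successors[state]:
        values |= reachable_decisions(nxt)
    return values


def valence(state: str) -> str:
    """'bivalent' if both values are reachable, otherwise '0-valent' or '1-valent'."""
    values = reachable_decisions(state)
    if len(values) > 1:
        return "bivalent"
    return f"{next(iter(values))}-valent"


for s in ["s", "a", "b"]:
    print(s, valence(s))
```

Here s and a are bivalent, while b is 1-valent: the step from s to b is one that fixes the outcome.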

The first step is to show that any consensus protocol must have an initial
bivalent state. Let x_0 (and x_1) be the initial state in which all processes have
input 0 (and 1). Clearly, x_0 is 0-valent, and x_1 is 1-valent. Now suppose by way of
contradiction that all initial global states are univalent. Let y_i be the initial
state in which processes 0, ..., n - i start with input 0, and the rest start with
input 1. By construction, y_0 = x_0 and y_{n+1} = x_1. We now argue by induction. As
noted above, y_0 is 0-valent. Assume, as induction hypothesis, that y_i is 0-valent.
Let z_i be a global state reached by starting in global state y_i, crashing process
n - i (the one process whose input differs between y_i and y_{i+1}) before it takes
any steps, and running the protocol to completion by execution e. Let z_{i+1} be a
global state reached by starting in global state y_{i+1}, crashing the same process
before it takes any steps, and running the protocol to completion by
the same execution e. Since the only process that can distinguish the initial
states y_i and y_{i+1} crashes before sending any messages (or writing to shared
memory), the global states z_i and z_{i+1} must be indistinguishable to every
non-faulty process, so all such processes must decide the same value in both, and
y_{i+1} is 0-valent as well. It follows by induction that all processes must decide
the same value in y_0 = x_0 and y_{n+1} = x_1, a clear contradiction.

The second step in the FLP proof is to show that any bivalent state x can
always be extended to another bivalent state, implying that the protocol can
be made to run forever, a clear violation of the termination condition. Start
the protocol in a bivalent initial state, and run it for as long as possible in a
bivalent state. Say that an operation O is pending in global state x if some
process is about to execute O. For brevity, we restrict our attention to
shared-memory reads and writes (the analysis for message send and receive is
essentially the same). If x is bivalent, then as long as some pending operation
carries the protocol to a bivalent state, execute that operation. Because the
protocol must eventually terminate, every execution must eventually reach a
bivalent state where every pending operation carries it to a univalent state.
Because x is still bivalent, some pending operation O_0 carries the protocol to a
0-valent state, and another pending operation O_1 to a 1-valent state.

The rest is a case analysis. For example, if O_0 and O_1 are both reads, then
the 0-valent state reached by executing O_0 followed by O_1 is indistinguishable
from the 1-valent state reached by executing the operations in the reverse order. 
The other cases are left as an exercise for the reader (hint: you can fail at most 
one process). 

This elegant argument, due to FLP, may appear to need no further elaboration.
Nevertheless, it is instructive to recast this proof in the following geometric
way. Each global state is a vertex in a graph G. Two global states are linked
by an edge if they are indistinguishable to some process. If every initial state
is univalent, then we can color each vertex with its eventual decision value. The
global state x_0 in which all processes have input zero is colored with zero, and
the corresponding global state x_1 is colored with one. The FLP proof shows that
there is a path in G linking x_0 and x_1.

Lemma 1 (1-dimensional Sperner). Consider a graph in which each vertex
is colored with a binary color. If the graph encompasses a path from vertex x_0
to vertex x_1, and x_1 and x_0 are colored with different colors, then two adjacent
vertexes in the path are colored with different colors.

The proof is a simple counting argument, showing that an odd number of edges
in the path have two colors. This lemma implies that if all initial states are
univalent, then there are two initial global states, y_i and y_{i+1}, differing only in
the input to some process p, such that y_i is 0-valent, and y_{i+1} is 1-valent. As we
have seen, this observation leads to a contradiction.
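
The counting argument behind Lemma 1 is easy to check experimentally. The sketch below represents a path simply as the list of its vertex colors (an assumption made for brevity) and counts the edges whose endpoints differ; the helper name bichromatic_edges is ours.

```python
import random
from typing import List


def bichromatic_edges(colors: List[int]) -> int:
    """Number of edges of the path whose two endpoints have different colors."""
    return sum(1 for a, b in zip(colors, colors[1:]) if a != b)


# Randomly color the interior of many paths whose endpoints are colored 0 and 1.
rng = random.Random(0)
for _ in range(1000):
    inner = [rng.randint(0, 1) for _ in range(rng.randint(0, 20))]
    path = [0] + inner + [1]
    # The color changes an odd number of times, so at least one edge is two-colored.
    assert bichromatic_edges(path) % 2 == 1

print("every sampled path has an odd number of two-colored edges")
```

Since an odd number is never zero, at least one pair of adjacent vertices must carry different colors, which is the statement of the lemma.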

The second part of the FLP proof can also be recast in graph-theoretic
terms. Each global state (not just the initial states) is a vertex in a graph. Two
vertexes (global states) are linked by an edge if they are indistinguishable to some
process. As argued above, if vertex x is univalent (or bivalent), then so is every
vertex connected to x by a path in the graph. The FLP proof (reinterpreted)
argues that if vertex x is bivalent, but some pending operation leaves the system
in 0-valent global state x_0, while another leaves the system in 1-valent global
state x_1, then x_0 and x_1 are connected by a path in the graph, implying that they
have the same valence (or none), a contradiction.



3.2 Consensus in Synchronous Systems

In a synchronous message-passing system, consensus can be solved in f + 1
rounds. In 1982 Dolev and Strong [16] proved that it is impossible to solve
consensus in fewer than f + 1 rounds. The argument also uses indistinguishability
with similarity chains, but in a different way. The proof assumes by way of
contradiction that some protocol finishes in f rounds, and finds a similarity chain
from a failure-free execution where all processes start with 0, to a failure-free
execution where all processes start with 1. We illustrate this construction with an
example in which n > 2 and f = 1. Consider a one-round execution e in which all
processes start with input 0. Now consider the execution e' identical to e except
that p_0 fails to send a message to p_1. If e leaves the protocol in global state x,
and e' in global state x', then x and x' are indistinguishable to p_2. Continuing in
this way, we remove, one by one, each of the messages sent by p_0, constructing a
similarity chain from the failure-free execution to the execution in which p_0 fails
cleanly (without sending any messages). Now consider the execution in which
p_0 fails cleanly with input 1. One by one, restore each of the messages sent by
p_0, until we reach an execution e_1 in which p_0 starts with input 1, the rest start
with input 0, and no failures occur. All processes decide 0 at the end of e, and
e is linked to e_1 by a similarity chain, so all processes must decide 0 at the end
of e_1. Continuing this construction, however, we can replace each 0 input with
a 1, ending up with a failure-free execution e_n in which all processes start with
1. Since e and e_n are linked by a similarity chain, the processes must decide the
same way in both executions, a contradiction.
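
The one-round construction can be spelled out mechanically. The sketch below is a simplified model chosen for illustration: in the single round every process sends its input to all others, an execution is a triple (inputs, faulty process or None, set of recipients that miss the faulty process's message), and a local state is a process's input together with the messages it received. The names Execution, view, chain and linked are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

# (inputs, faulty process or None, recipients that miss the faulty process's message)
Execution = Tuple[Tuple[int, ...], Optional[int], FrozenSet[int]]


def view(e: Execution, q: int):
    """Local state of q after one round: its own input and the messages received."""
    inputs, faulty, dropped = e
    received = tuple(
        None if (s == faulty and q in dropped) else inputs[s]
        for s in range(len(inputs)) if s != q
    )
    return (inputs[q], received)


def chain(n: int) -> List[Execution]:
    """Executions leading from all-0, failure-free to all-1, failure-free."""
    inputs = [0] * n
    out: List[Execution] = [(tuple(inputs), None, frozenset())]
    for i in range(n):
        others = [q for q in range(n) if q != i]
        for k in range(1, len(others) + 1):    # remove p_i's messages one by one
            out.append((tuple(inputs), i, frozenset(others[:k])))
        inputs[i] = 1                          # p_i now fails cleanly with input 1
        for k in range(len(others), 0, -1):    # restore the messages one by one
            out.append((tuple(inputs), i, frozenset(others[:k])))
        out.append((tuple(inputs), None, frozenset()))
    return out


def linked(a: Execution, b: Execution) -> bool:
    """Some process that is non-faulty in both executions has the same view in both."""
    return any(
        q != a[1] and q != b[1] and view(a, q) == view(b, q)
        for q in range(len(a[0]))
    )


c = chain(3)
assert all(linked(a, b) for a, b in zip(c, c[1:]))
print(len(c), c[0][0], c[-1][0])
```

Every consecutive pair is linked only because n > 2: when the message to one recipient is removed or restored, a third process that is neither the sender nor that recipient sees no difference.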

A slightly more complicated construction is needed for the multi-round case,
but the basic idea is to establish connectivity between the failure-free execution
in which all processes start with 0, and the one in which they start with 1.



3.3 Approximate Agreement

Before turning to the more general question of characterizing the tasks that
are solvable in the presence of a single failure, let us consider a relaxed form of
consensus, called ε-approximate agreement (e.g. [15]). In this task, the inputs to
the processes are real values. Processes must choose real decision values within
ε of each other, and each process’s decision must lie within the range of the
initial inputs.

A simple and elegant algorithm for approximate agreement can be described
in the shared memory model using atomic snapshots. An atomic snapshot object
is an array of registers, in which each process is assigned an array element. A
process can write its own element, and atomically read the entire array. Atomic
snapshot objects can be implemented using read/write variables in the presence
of any number of failures [5]. In the approximate agreement algorithm, a process
repeatedly writes its current proposed decision value (initially its input), and
takes a snapshot. For the next iteration, its proposed value is the average of the
proposed values it read in the last snapshot. Different atomic snapshot objects
are used for each iteration. The number of iterations needed depends on ε. It is
easy to check that in each iteration the range of values held by the processes is
divided by 2.
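
The following is a minimal simulation sketch of this algorithm, not the implementation from the cited work. It makes two assumptions that should be stated loudly: no process crashes, and "average" is read as the midpoint of the smallest and largest values in the snapshot (with that reading the range provably halves, because snapshots are ordered by containment). The function names and the random scheduler are ours.

```python
import random
from fractions import Fraction
from typing import Dict, List


def one_iteration(values: List[Fraction], rng: random.Random) -> List[Fraction]:
    """One iteration under a random interleaving of writes and snapshots."""
    n = len(values)
    # Each process appears twice in the schedule: its first occurrence is its
    # write, its second occurrence is its atomic snapshot.
    events = [p for p in range(n) for _ in range(2)]
    rng.shuffle(events)
    written: Dict[int, Fraction] = {}
    proposals = list(values)
    for p in events:
        if p not in written:
            written[p] = values[p]
        else:
            seen = list(written.values())
            # Assumption of this sketch: "average" = midpoint of min and max seen.
            proposals[p] = (min(seen) + max(seen)) / 2
    return proposals


def approximate_agreement(inputs: List[int], eps: Fraction,
                          seed: int = 0) -> List[Fraction]:
    """Run as many iterations as eps and the input range require."""
    rng = random.Random(seed)
    values = [Fraction(v) for v in inputs]
    bound = max(values) - min(values)      # upper bound on the current range
    while bound > eps:
        values = one_iteration(values, rng)
        bound /= 2                         # the range at least halves per iteration
    return values


decisions = approximate_agreement([0, 1, 1, 0, 1], Fraction(1, 100))
print(max(decisions) - min(decisions) <= Fraction(1, 100))
```

Exact rational arithmetic (fractions.Fraction) is used so that the halving bound can be checked without floating-point tolerance.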

The Number of Different Values. This algorithm works with any number of
failures. If f is known, however, then a process can take repeated snapshots until
it observes at least n - f + 1 proposed values, ensuring that at most f + 1 distinct
decision values are chosen in any execution. Thus, the first n - f processes to
finish an iteration will see each other's proposed values, and will compute the
same value for the next iteration. The remaining f processes may see additional
estimates, and hence may compute different values. The total number of values
will thus be at most f + 1.

An interesting way of interpreting the approximate agreement algorithm for
f = 1 is the following. Consider a graph which is a simple path. Label the
endpoints 0 and 1, and the other vertices, moving from the 0-vertex to the
1-vertex, with evenly distributed, increasing values between 0 and 1. The processes
start with inputs either 0 or 1. The processes repeatedly write their proposed
values, which are vertices of the graph, and take snapshots. In each iteration,
if a process saw two different vertices, it proposes to the next round a vertex
in the middle of the sub-path joining the two vertices. If it saw only one vertex
(its own), it stays with the same value. Under this perspective, ε-approximate
agreement is just agreement on a sufficiently long path. Each process ends up
deciding on a vertex, such that the decided vertices are either the same or are
joined by an edge.

Notice that by the FLP impossibility result, any 1-resilient algorithm must
have at least one execution where at least two different values are decided, and
hence, in this sense, the above approximate agreement algorithm is optimal.



3.4 Solvability of Tasks with One Failure

Starting in 1987, in a series of papers, Moran and Wolfstahl [40] and Biran,
Moran and Zaks [11] embarked on a general study of solvability of tasks in
asynchronous message-passing systems with a single crash failure. This work
generalized the FLP result from consensus to arbitrary tasks, and gave additional
insights into the role of connectivity. In [40] a necessary condition for a task to
be 1-solvable was given. The corresponding sufficient condition appeared in [11].
This characterization is based on connectivity, again using similarity chains. As
a consequence, the problem of deciding when a decision task has a 1-resilient
solution in the asynchronous message-passing model is decidable. Using
simulations [3] between shared memory and message passing, or using the direct
approaches of Section 3.6, we get similar results for shared memory.

Formally, a decision task consists of a set of input vectors, I, a set of output
vectors, O, and an input/output task relation, Δ. An input vector specifies an
initial value for each process; an output vector specifies its output value. The
relation Δ associates to each input vector x ∈ I, the set Δ(x) of allowable
output vectors. Now, we say that a set of vectors is connected if it is possible to
get from any vector to any other vector by a sequence of vectors, such that each
two consecutive vectors differ in exactly one component.

The characterization theorem states that the task is solvable with one failure
if and only if there exists a restriction Δ' of Δ such that for every connected
X ⊆ I, Δ'(X) is connected. The introduction of the restriction Δ' is due to the
fact that the decisions taken by a protocol have to span only a subset of the
decision vectors allowed by the task specification Δ.
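
To see the connectivity condition at work, the sketch below checks connectedness of a set of vectors (two vectors are adjacent when they differ in exactly one component) and applies it to binary consensus for three processes. The encoding of vectors as tuples and the names adjacent, is_connected and consensus_delta are assumptions of this example.

```python
from itertools import product
from typing import Iterable, Set, Tuple

Vector = Tuple[int, ...]


def adjacent(u: Vector, v: Vector) -> bool:
    """Two vectors are adjacent if they differ in exactly one component."""
    return sum(a != b for a, b in zip(u, v)) == 1


def is_connected(vectors: Iterable[Vector]) -> bool:
    """Graph search over the 'differ in one component' relation."""
    remaining = set(vectors)
    if not remaining:
        return True
    frontier = [remaining.pop()]
    while frontier:
        u = frontier.pop()
        reached = {v for v in remaining if adjacent(u, v)}
        remaining -= reached
        frontier.extend(reached)
    return not remaining


def consensus_delta(x: Vector) -> Set[Vector]:
    """Allowable outputs for input x: everyone decides the same input value."""
    return {(v,) * len(x) for v in set(x)}


inputs = set(product((0, 1), repeat=3))
outputs = set().union(*(consensus_delta(x) for x in inputs))
print(is_connected(inputs), is_connected(outputs))
```

The input set is connected while the allowable outputs (0,0,0) and (1,1,1) are not, and no restriction of the consensus relation can help, since validity forces the all-0 input to map to (0,0,0) and the all-1 input to (1,1,1).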

The necessity proof in [40] is by reduction to consensus. It is shown that
if there is a protocol P that solves a disconnected task, then it can be used to
produce another protocol P' that solves consensus, which is impossible, by the
FLP result. The protocol P' decides 0 if P ends up in one connected component,
and decides 1 if P ends up in a different component.

To show sufficiency, Biran et al. [11] use a form of approximate agreement
protocol, with f = 1. As noted above, this protocol has the property that the
processes decide on at most two different values. Given a connected task, a
process first writes its input to a shared memory ([11] is described using message
passing, but the ideas are similar), and then takes snapshots until it sees the
input values of all but at most one process. Now, the process chooses an output
vector as input for the approximate agreement task. If it saw all processes'
inputs, that is, a full input vector x ∈ I, then it chooses a default output
vector y ∈ Δ(x) as input for the approximate agreement. Otherwise, it chooses an
output vector that is allowable no matter what the remaining input value is; that
is, a default output vector in the intersection of Δ(x), for every x that extends
the n-vector it knows. Finally, the processes execute approximate agreement on the
path (as in Section 3.3) that joins the two proposed output vectors.



3.5 New Perspectives with a Single Failure

Following the new topological perspective [34], the single-failure (f = 1)
characterization can be described without a reduction to consensus. Consider the
set of protocol vectors, P, one for each final state of the protocol in some
execution. Each component of a protocol vector consists of a final local state of
some process. We can view P as a relation that gives the set of protocol vectors,
P(x), for each input vector x.

The main theorem for the characterization consists of showing that any
protocol that tolerates one failure has a connected set of protocol vectors, P(x),
for every x ∈ I. There are several ways of proving such a theorem. Herlihy and
Shavit [34] use a critical state argument, similar to FLP. Another approach is
to consider a well structured subset of the executions [6,8,44], and on these
prove the connectivity property. These proofs are most naturally expressed for
the wait-free case. A reduction to f = 1 is done using a simulation [9]. Another
technique is to use a combination of the structured subset of executions, with
the bivalency argument [41], as described in the next section.

By working with protocol vectors, one learns an important lesson: 

The protocol vectors preserve the connectivity structure of the input vec- 
tors. 

This means, roughly, that the connectivity graph of the input vectors and the 
one of the protocol vectors are homeomorphic. Here we clearly encounter topo- 
logical notions for the first time. Two geometric objects are homeomorphic if 
it is possible to deform one into the other by continuous transformations (e.g., 
stretching and bending, but not tearing). Speaking informally, our protocol graph
is a stretched version of the input graph; a path in the input graph becomes a 
longer path in the protocol graph, but it is still connected. 

In more detail, the decisions taken by the processes induce a decision map
δ from P to O, satisfying δ(P(x)) ⊆ Δ(x) for every x ∈ I. Moreover, δ is
“continuous” in the following sense: it maps vertexes to vertexes, and also edges
to edges. Therefore, it sends each connected subgraph into a connected subgraph.
The FLP impossibility result can now be clearly understood: the input graph is
connected, and therefore so is the protocol graph. Thus, the image of the
protocol graph under δ is also connected. However, consensus requires sending the
protocol vectors to two disconnected output vectors, the all 0 vector and the all
1 vector. This is impossible, and hence consensus is unsolvable.

Remarkably, this is just what Lemma 1 says. Consider a graph which is a
simple path, P, and a graph C which consists of two isolated vertices, denoted
0, 1. The coloring of vertices is described by a function f from vertices of P to
vertices of C. We view P and C as geometric objects, and f as instructions on
how to “bend” P into C. If it is required that f sends the endpoints of P to
different vertices of C, the Lemma says that at least one edge of P will have to
“jump” from one vertex of C to the other. As an exercise, it is easy to verify that
all these arguments still hold if C consists of two disconnected graphs, called 0
and 1.



3.6 Unifying Consensus-Style Results

We mentioned at the end of Section 2 that one of the new insights gained by the
topological approach is to better understand the relationship between different
models of distributed computation, in particular in three papers [21, 33, 41]. We
describe here the one-dimensional approach of [41]. Moses and Rajsbaum [41]
presented a unified framework to study solvability questions with 1 failure in
asynchronous message passing and shared memory systems, and the relations
to synchronous systems, simplifying and unifying previous results. In particular,
they show that task solvability in all these situations depends only on
connectivity, and hence is decidable. Inspired by the more recent perspectives,
the work is based on the notions of

— sub-models — considering only a well-structured subset of the executions, 

— round-by-round analysis — the structure of the sub-model is synchronous, 
and can be seen as consisting of rounds, 

— connectivity vs. bivalency notions — using FLP arguments in a synchronous 
setting. 

Initially, they present an abstract model of computation, which is later
instantiated to various classic models of message-passing and shared memory. The
model is based on the notion of a system which is simply a set of runs, i.e.,
sequences of states with a specification of which processes have failed in each run.
As in Section 2, an environment encompasses the state of the communication
mechanisms. The general argument is: (i) if a set of states is “connected,” and
(ii) it contains a 0-valent and a 1-valent state, then it must contain a bivalent
state.

Besides the usual connectivity notion in terms of similarity chains, another
notion based on the “potential” value of a state is used, which considers
possible future decision values from that state. This notion is useful in unifying
the treatment for the different concrete models of computation. The previous
general argument can be used to get an abstract impossibility proof for consensus.
First, show (as in Section 3) that there is an initial bivalent state. Then, show
that the set of successors of every state is connected. Therefore, by induction,
construct a run that consists of bivalent states. Finally, show that consensus is
not solved in this run, because a consensus algorithm cannot terminate while in a
bivalent state. There are various technical details that need to be taken care of,
but this is the general idea. For example, this immediately yields a very simple
bivalency proof of the f + 1 synchronous lower bound described above.

Now we can revisit the FLP bivalency argument. Recalling the protocol vectors
framework of the previous section, and in particular the property highlighted in
Section 3.5, we have that an asynchronous consensus algorithm cannot terminate in
a bivalent state x because the states in the future of x are connected but have to
be sent to something disconnected: the all-0 and the all-1 vectors. However, notice
that this argument fails in models for which consensus can be solved, because a
bivalent state may have successor states that are not connected.

To facilitate impossibility proofs for concrete models, and to expose the
similarities between them, the notion of layering is introduced. In particular,
layering avoids the need to ensure that at most one process crashes in any bivalent
run. Instead of trying to prove connectivity properties of all executions in a
model, only a well-structured subset of the executions of the model is considered,
which includes a very small degree of asynchrony. Given a state x of an algorithm,
consider a set S(x) of, not necessarily immediate, successor states. These are
thought of as “the next layer” of computation. For instance, in the asynchronous
shared memory model, layering facilitates the fairness arguments to guarantee at
most one failure. One such layering is called permutation layering, and is obtained
by having either n - 1 or all n processes execute their pending operations, in
a linear ordering. Thus, in this layering, S(x) consists of all states obtained by
performing at least n - 1 pending operations in all possible linear orderings.
The two properties needed of a layering to give the consensus impossibility are
very easy to show. First, that S(x) is connected, and second, that the sub-model
defined by these runs still allows at most one process to crash.
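
As a small illustration of permutation layering, the sketch below enumerates only the schedules of one layer, that is, the linear orderings of n - 1 or n processes. Applying a schedule to a state x to obtain a member of S(x) depends on the concrete model and is deliberately not modelled; the function name permutation_layer is hypothetical.

```python
from itertools import combinations, permutations
from typing import Iterator, Tuple


def permutation_layer(n: int) -> Iterator[Tuple[int, ...]]:
    """Schedules of one layer: n - 1 or n processes, in every linear order."""
    for size in (n - 1, n):
        for subset in combinations(range(n), size):
            # Each ordering says which process performs its pending operation next.
            yield from permutations(subset)


layer = list(permutation_layer(3))
print(len(layer))
print(layer[:3])
```

For n processes there are n! orderings of all processes and n * (n - 1)! orderings that leave one process out, so a layer has 2 * n! schedules in this sketch.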

These ideas are used to obtain the necessity part of characterization theorems
for solvability in the asynchronous models, and in the f-resilient synchronous
model; sufficiency is obtained using approximate agreement.



4 Higher Degrees of Connectivity 

When we go from 1-resilient computing to f-resilient, for f > 1, we must replace
the one-dimensional notion of connectivity with higher-dimensional notions. In
topology, these notions arise in the study of geometric objects which can be
continuously deformed. That is, for a topologist, a disk and a triangle are
essentially the same object, i.e. homeomorphic, because one can be continuously
deformed into the other without tearing it apart.

In particular, the size of an object is not interesting because it can always 
be shrunk or expanded to the desired size. On the other hand, an object which
consists of two separate pieces is never homeomorphic to one which is connected. 
A sphere and a torus cannot be homeomorphic, because one has holes, and the 
other does not. Holes can be of different dimensions, as we shall see. Whether 
two objects are homeomorphic depends on the dimension and structure of their 
holes. 

The ability of a system to solve tasks depends on the topological structure of
the runs of the system, and on whether it is “compatible” with the topological
structure of the task itself. As we have seen, in the case of f = 1, “compatible”
means that a system can solve a task if and only if both have the same
connectivity structure. For f > 1, the compatibility must hold for connectivity in
higher dimensions. As we shall see, the notions described in the previous section
are nicely generalized for the case of f > 1.







4.1 Set-consensus 

In 1990, Soma Chaudhuri [12] proposed a simple generalization of consensus
in which more than one distinct decision value can be chosen. This problem 
later triggered the discovery of the fundamental role of topology in distributed 
computing. The termination and validity conditions of consensus are maintained, 
while the agreement condition is relaxed. In the k-set-consensus problem, k < n, 
each process gets a value from the set {0, 1, . . . , n — 1}, and has to decide on a 
value satisfying: 

— Termination: Every non-faulty process must at some point irreversibly
decide some value;

— Agreement: All non-faulty processes decide on at most k different values; 
and 

— Validity: Every decided value must be the input of some process. 

Set-consensus has the interesting property that it can be solved in a
sufficiently reliable system: k ≥ f + 1. Moreover, it can be solved with one
iteration of the approximate agreement algorithm of Section 3.3; the processes take
snapshots until at least n - f input values are read, and then decide on one of
those values, say the median. Clearly, this algorithm terminates in a system with
at most f failures, and validity is satisfied. As discussed in Section 3.3, the
number of different values is at most f + 1, and hence the agreement condition is
satisfied. Also, for f = 1, it follows that 2-set-consensus is solvable, and as
mentioned in that section, by the FLP result, 1-set-consensus, i.e. consensus, is
not solvable. To show that, in general, k-set-consensus is not solvable if
k < f + 1, we need to deal with higher-dimensional versions of connectivity, and
of Lemma 1. We do this next.
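
The one-iteration algorithm just described can be simulated as follows. This is a sketch under simplifying assumptions of ours: crashed processes stop before writing, the snapshot object is modelled as the prefix of a write order, and a process may take its deciding snapshot at any point after its own write once at least n - f values are present. The name set_consensus_run is hypothetical.

```python
import random
from typing import Dict, List


def set_consensus_run(inputs: List[int], f: int,
                      rng: random.Random) -> Dict[int, int]:
    """One simulated execution with at most f crashes; returns process -> decision."""
    n = len(inputs)
    crashed = set(rng.sample(range(n), rng.randint(0, f)))  # crash before writing
    writers = [p for p in range(n) if p not in crashed]
    rng.shuffle(writers)  # order in which inputs reach the snapshot object
    decisions: Dict[int, int] = {}
    for pos, p in enumerate(writers):
        # p's deciding snapshot holds the first t writes, for some t that covers
        # p's own write and at least n - f values.
        t = rng.randint(max(pos + 1, n - f), len(writers))
        snapshot = sorted(inputs[q] for q in writers[:t])
        decisions[p] = snapshot[len(snapshot) // 2]  # "say the median"
    return decisions


rng = random.Random(0)
n, f = 6, 2
worst = max(
    len(set(set_consensus_run(list(range(n)), f, rng).values()))
    for _ in range(2000)
)
print(worst <= f + 1)
```

Snapshots are nested and contain between n - f and n values, so at most f + 1 distinct snapshots, and hence at most f + 1 distinct decisions, can occur in one execution.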

4.2 Topology 

A graph can be seen as a collection of 1-element sets (vertices), and of 2-element
sets (edges), closed under containment, since if {u, v} is in the collection, so are
{u} and {v}. A simplicial complex (or simply a complex) is a collection of sets,
called simplexes, closed under containment. Thus, the subsets of a simplex, called
faces, are included in the complex. The dimension d of a simplex is one less
than its cardinality; such a simplex is called a d-simplex. The 0-simplexes are
also called vertices. In an N-dimensional complex the maximum dimension of a
simplex is N, and every simplex is contained in an N-simplex.
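
A combinatorial complex is straightforward to represent in code. The sketch below stores a complex as a set of frozensets and builds it as the closure, under taking non-empty subsets, of a list of generating simplexes; the helper names closure, dimension and is_complex are ours.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Simplex = FrozenSet[str]


def closure(simplexes: Iterable[Iterable[str]]) -> Set[Simplex]:
    """Smallest complex containing the given simplexes: add every non-empty face."""
    complex_: Set[Simplex] = set()
    for s in simplexes:
        vertices = sorted(s)
        for size in range(1, len(vertices) + 1):
            complex_.update(frozenset(face) for face in combinations(vertices, size))
    return complex_


def dimension(simplex: Simplex) -> int:
    """One less than the number of vertices."""
    return len(simplex) - 1


def is_complex(family: Set[Simplex]) -> bool:
    """Closed under containment: every non-empty face of a member is a member."""
    return family == closure(family)


# Two triangles glued along the edge {b, c}: a 2-dimensional complex.
K = closure([{"a", "b", "c"}, {"b", "c", "d"}])
print(len(K), max(dimension(s) for s in K))
print(is_complex(K), is_complex({frozenset({"a", "b"})}))
```

The example complex has 4 vertices, 5 edges and 2 triangles; a lone edge without its endpoints fails the closure test.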

An n-simplex has a simple geometric interpretation, as the convex hull of n + 1
independent points in a Euclidean space. A complex is a discrete approximation
to a geometric object such as a surface or solid. For example, a 2-dimensional
disk is homeomorphic to a 2-simplex (a solid triangle).

It is often useful to subdivide a complex, replacing each simplex with a
complex that occupies the same point set. Subdivisions play an important role in
distributed computing, because a subdivision of a complex preserves all its
structure. We start with the generalized version of Lemma 1. Notice that in an
N-complex which is a subdivided N-simplex every (N - 1)-simplex is contained in
either one or two N-simplexes. For example, in a subdivided triangle, every edge
is contained in either one or two triangles. The (N - 1)-simplexes contained in
exactly one N-simplex are the boundary of the complex. We state only the two
dimensional version of Lemma 1, because it gives very precisely the intuition of
the higher dimensional version. In this case, we consider an arbitrarily
subdivided triangle. It has three vertices in its corners, and its boundary is a
graph consisting of three paths connected at the corners, forming a cycle. The
inside of the triangle consists of many little triangles.

Lemma 2 (2-dimensional Sperner). Consider a 2-dimensional subdivided
simplex, K^2, such that every vertex is assigned a color from {0, 1, 2}, (i) the
corners are assigned different colors, and (ii) the vertices in the path connecting
two corners are colored with the colors of these corners. Then there is at least
one 2-simplex colored with all three colors.

The proof of Lemma 2 is an elementary parity counting argument, and uses 
the fact that there is an odd number of edges in the boundary colored 0 and 1, 
as implied by Lemma 1. The proof shows that the number of triangles colored 
0, 1, 2 is odd. For example, see [6, 10]. 
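
The parity claim can be checked numerically on one particular family of subdivisions. The sketch below uses the regular triangular grid of side m (an assumption; the lemma holds for arbitrary subdivisions), indexes vertices by integer pairs (i, j) with i + j <= m, and colors them at random subject to the two conditions of Lemma 2. All function names are ours.

```python
import random
from typing import Dict, Iterator, Tuple

Vertex = Tuple[int, int]


def sperner_coloring(m: int, rng: random.Random) -> Dict[Vertex, int]:
    """Random coloring obeying the corner and boundary conditions of Lemma 2."""
    color: Dict[Vertex, int] = {}
    for i in range(m + 1):
        for j in range(m + 1 - i):
            coords = (i, j, m - i - j)   # barycentric-style coordinates
            # A vertex may only use the colors of corners it has non-zero weight
            # on: corners get their own color, a boundary path gets the two
            # colors of its end corners, interior vertices may use any color.
            allowed = [c for c in range(3) if coords[c] > 0]
            color[(i, j)] = rng.choice(allowed)
    return color


def triangles(m: int) -> Iterator[Tuple[Vertex, Vertex, Vertex]]:
    """All little triangles of the grid subdivision (m * m of them)."""
    for i in range(m):
        for j in range(m - i):
            yield ((i, j), (i + 1, j), (i, j + 1))
            if i + j <= m - 2:
                yield ((i + 1, j), (i, j + 1), (i + 1, j + 1))


def trichromatic(m: int, color: Dict[Vertex, int]) -> int:
    """Number of little triangles whose vertices carry all three colors."""
    return sum(1 for t in triangles(m) if {color[v] for v in t} == {0, 1, 2})


rng = random.Random(0)
counts = [trichromatic(m, sperner_coloring(m, rng)) for m in range(1, 9)]
print(all(c % 2 == 1 for c in counts))
```

An odd count is in particular non-zero, so every such coloring contains at least one triangle with all three colors.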

Although Lemma 2 can be proved using just combinatorics, a nice intuition
for this lemma comes by proving it using topological arguments. This proof is by
viewing it as folding one object into another, as in Section 3.5. The first object
is the subdivided simplex, K^2, and it is folded into a triangle, S^2, whose three
vertices are called 0, 1, 2, subject to the restriction that the boundary of K^2 is
mapped to the boundary of S^2. The lemma says that, since K^2 has no holes, then
at least one of its triangles will have to go to the 2-simplex of S^2.

Holes. The role of the holes of a complex is very important for distributed 
computing (and for topology). A complex has no holes of dimension d if any 
continuous map of the d-sphere into that complex can be “filled in” by extending 
it to a (d + 1)-disk. For example, a complex has no holes of dimension 0 if it
is connected: a 0-sphere consists of two disconnected points, and the 1-disk is a 
path between them. A subdivided simplex has no holes of any dimension, and 
this is important in the proof of Lemma 2. In general, a complex is solid if it has 
no holes of any dimension. 

4.3 Asynchronous Wait-free Solvability 

We now describe how to use topological notions to model decision tasks and
protocols. A task consists of an input complex, I, an output complex, O, and
an input-output relation Δ, from I to O. Each simplex in I specifies a set of
input values to the processes, and each output simplex in O specifies a set of
output values. If S ∈ I, then Δ(S) ⊆ O specifies the allowable outputs when the
processes start with inputs from S. If S^d ∈ I is of dimension d, then it specifies
inputs for d + 1 processes, when they run to completion before the other processes
take any steps. In this case, Δ(S^d) ⊆ O is a subcomplex of dimension d.

The protocol complex, P, consists of an (n - 1)-simplex for each final state of
the system, and all their faces. Each vertex of a simplex is labeled with a local
state of a process in the execution corresponding to that simplex, representing
the local view of that process in the execution. A simplex corresponds to a set of
executions, which are indistinguishable to the processes of that simplex. If the
simplex is of dimension d, it represents a set of executions where d + 1 processes
observe operations from each other, but not from the other n - (d + 1) processes.

When d + 1 processes start with inputs from S^d ∈ I, P(S^d) is the subcomplex
of P of all states corresponding to executions starting in S^d. Now, the decisions
of the processes induce a simplicial decision map δ from P to O. This map sends
a vertex of P of some process with some decision value, to a vertex of O labeled
with the same process and output value. The map δ is simplicial because it sends
simplexes to simplexes, i.e., intuitively, it is continuous, and deforms P into O,
with the restriction:

For every S ∈ I, δ(P(S)) ⊆ Δ(S).

The following fundamental result is implied by [8, 34, 44]. The protocol
complex P(S), for every S ∈ I, of an asynchronous wait-free system is solid. This
implies that P preserves all the structure of I; P looks like a subdivision of I.
This claim is the “asynchronous computability theorem” of Herlihy and Shavit
[34, 35]:

Theorem 1 (HS). A task (I, O, Δ) is wait-free solvable if and only if there is a
subdivision of I, and a simplicial map f from I to O, such that f sends vertices
of one process to vertices of the same process, and f(S) ∈ Δ(S), for every S ∈ I.

That is, (I, O, Δ) is wait-free solvable if and only if I can be stretched and
bent into O satisfying the requirement of Δ. The harder the task is to solve, the
finer the subdivision must be, and the larger the time complexity of the
algorithm.

Herlihy and Shavit proved the necessity of Theorem 1 in [34], using critical
state arguments in the style of FLP. An alternative approach is to consider
structured subsets of executions (as in [6,8,44]) where it can be directly shown
that a subdivision is induced. The sufficiency part [35] uses a form of approximate
agreement.

In a sense, the wait-free case is fundamental. In the general case of 1 < f < n,
the corresponding result states that the protocol complex has no holes below
dimension f. Two approaches for proving this result have been suggested: a
critical state argument [34], and a reduction [9] to the wait-free case using
simulations [8].

For analogous properties in other models of distributed computation, like
synchronous and asynchronous message passing, and for stronger communication
primitives in shared memory like set-consensus, see for example [21,28,33,41].



4.4 Applications 

Set-consensus. We are ready to see why k-set-consensus is unsolvable when
k < f + 1. We first consider the wait-free case, i.e., f = n, and k < n + 1. This
case captures the main ideas; the general case can be proved by reduction [9].
For this example, we consider only the three-process case.

Consider an input 2-simplex, $S^2$, where processes $p_0, p_1, p_2$ start with values
0, 1, 2. As described in the previous section, $\mathcal{P}(S^2)$ is a subdivision of $S^2$. For
each i, if $p_i$ runs solo, i.e., finishes its computation before seeing operations by
the other processes, it has to decide i. Thus, the i-th corner of $\mathcal{P}(S^2)$ is colored
with decision value i. Similarly, in any execution where $p_i, p_j$ see only each other,
they have to decide either i or j, and hence all vertices along that boundary of
$\mathcal{P}(S^2)$ are colored either i or j. All internal vertices of $\mathcal{P}(S^2)$ are colored with
any of the decision values 0, 1, 2. Hence the coloring induced by the
decision values satisfies the hypothesis of Lemma 2, so there must be at least one
simplex colored with all three colors, and hence an execution where three values
are decided. Thus, k cannot be less than 3. This argument readily generalizes to
n processes, f = n - 1, and a Sperner argument shows that in at
least one execution n + 1 values are decided, implying that $k \ge f + 1$.
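
The coloring argument can be tried out numerically. The Python sketch below is
only an illustration under simplifying assumptions: it uses a regular grid
subdivision of a triangle rather than an actual protocol complex, and the
function names are hypothetical. It draws random colorings obeying the corner
and boundary rules described above and counts the small triangles that carry
all three colors; by Sperner's lemma this count is always odd, hence nonzero.

```python
import random

def sperner_coloring(m, rng):
    """Random coloring of the grid subdivision of a triangle obeying Sperner's rule.

    A vertex is a triple (i, j, k) with i + j + k == m; color c is allowed
    only when coordinate c is nonzero, so corner c gets color c and each
    boundary edge uses only the colors of its two endpoints.
    """
    coloring = {}
    for i in range(m + 1):
        for j in range(m + 1 - i):
            v = (i, j, m - i - j)
            coloring[v] = rng.choice([c for c in range(3) if v[c] > 0])
    return coloring

def small_triangles(m):
    """Yield the m * m small triangles of the subdivision."""
    for i in range(m):
        for j in range(m - i):
            k = m - 1 - i - j
            # "upward" triangle around the point (i, j, k) of level m - 1
            yield ((i + 1, j, k), (i, j + 1, k), (i, j, k + 1))
            if k >= 1:
                # the adjacent "downward" triangle
                yield ((i + 1, j, k), (i, j + 1, k), (i + 1, j + 1, k - 1))

def count_trichromatic(m, coloring):
    """Number of small triangles whose vertices carry all three colors."""
    return sum(1 for tri in small_triangles(m)
               if {coloring[v] for v in tri} == {0, 1, 2})
```

For example, `count_trichromatic(5, sperner_coloring(5, random.Random(0)))` is
an odd number whatever the seed; each such triangle plays the role of an
execution in which three distinct values are decided.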



4.5 Simplex Agreement and Convergence Tasks 

Recall the approximate agreement task of Section 3.3, where processes start with
real values and have to decide on real values $\epsilon$ apart from each other. In the
1-dimensional discrete version of the problem, processes start at the end-points of
a path, and have to converge to a single vertex or to two vertices joined by an
edge. In the d-dimensional version, processes start at the corners of a subdivided
d-simplex, and have to decide on (not necessarily distinct) vertices contained in
a simplex of the subdivision. Thus, processes start with at most d + 1 input
values (vertices in the corners of the subdivision), and have to decide on at most
d + 1 output values that form a simplex. This task is wait-free solvable using
the algorithm described in Section 3.3, by executing rounds of snapshots; the
number of rounds depends on the level of the subdivision. It is a consequence
of the set-consensus impossibility result that processes can become arbitrarily
close, but can never agree exactly: in at least one execution, at least d + 1 different
values are decided.

Most of the results obtained using topology are impossibility proofs. Various 
possibility results [30, 31] have to do with the solvability of approximate agree- 
ment on more arbitrary spaces. In a convergence task [30], processes start on 
vertices of a given complex, and have to converge on vertices of the complex 
contained in a simplex, subject to some restrictions. The restrictions specify 
the part of the complex where they can converge, depending on where they 
started. Herlihy and Rajsbaum [30] presented a generic algorithm to solve this
task, and gave conditions specifying when the algorithm solves the convergence 
task, depending on the complex and its restrictions, under various asynchronous 
shared memory systems. The idea is to use the discrete approximate agreement 
algorithm described above, and then use topological arguments to show that the 
processes can map the values obtained by this algorithm to decisions in the given 
complex.

Undecidability. A nice application of the wait-free solvability characterization
is proving that there is no algorithm that takes a decision task specification as
input and tells if there is a wait-free algorithm that solves the task [22, 30] (the
same result holds in other models [30]).

This undecidability result is by reduction to the classic contractibility prob- 
lem in topology: it is undecidable whether an arbitrary loop in an arbitrary finite 
complex can be contracted to a point. The result holds even if the complex is 
2-dimensional, and the loop is simple. 

The idea of the reduction is to show that the loop is contractible if and
only if there is a wait-free solution to a convergence task on the complex with
the following restrictions. Take three distinct, distinguished vertices on the given
loop. If all processes start on the same distinguished vertex, they all must decide
that same vertex. If they start on two distinguished vertices, they have to
decide on vertices spanning a simplex on the sub-path of the loop connecting
the two vertices. If the processes start on three different distinguished vertices,
they can decide on any vertices, as long as they are contained in a simplex of
the given complex.

The proof of this claim follows from the wait-free solvability characterization,
and the fact that if a loop is contractible then there is a simplicial map $\delta$ from a
sufficiently fine subdivision of a simplex to the complex, sending the boundary of
the simplex to the loop. The processes then solve approximate agreement on the
subdivided simplex, and follow the map $\delta$ to decide vertices of the given complex.
On the other hand, if the convergence task is solvable, then the fact that the
complex is solid implies that there is a contraction of the loop, because the
restrictions of the convergence task force the boundary of the protocol complex
to be mapped to the loop.

Abstract. In this paper we propose an improved disjunctive strictness
analysis system based on the work by Jensen ([1], [2]). The original sys- 
tem does not have the subject reduction property. The new system has 
the subject reduction property for parallel reduction and is stronger than 
the original system. 



1 Introduction 

Strictness analysis is a static analysis for lazy functional languages, similar
to neededness analysis. Its purpose is to identify those functions which can be
called “by-value” rather than “by-need” (as is dictated by the lazy semantics)
without affecting program behaviour. The information obtained by a strictness
analyzer can be used to improve the efficiency of code generated by compilers
(both sequential and parallel).

The problem of inferring precise strictness information is undecidable in general,
since it would entail the decidability of the halting problem for Lazy PCF
programs. Therefore we have to content ourselves with some kind of approximate
information. Obviously the more precise the information, the better use we can
make of it during the compilation of lazy programs. There are many different
inference systems for strictness properties, and systems incorporating disjunctive
information are usually much more precise than the others. This is
the reason why we are concerned with disjunctive systems in the first place. Also,
the field of disjunctive analysis techniques has not been thoroughly investigated
so far, since systems of this kind are usually harder to deal with than systems
which do not include disjunctive properties. Thus another motivation for this
work is to advance the state of knowledge in this field.

The disjunctive strictness analysis system was introduced in [1] (see also [2],
which is a journal version of the last chapter of [1]) and, independently, in [3].
There are also papers on disjunctive types for lambda calculus (a good source of
information is [4]), which are not immediately related to program analysis, but
provide a broader perspective for use of disjunctions in type systems.

We concentrate our investigations on a particular disjunctive system for
strictness analysis of simply-typed lazy programs that was proposed in [1], but
our results should be applicable to other systems of this kind. We use this
particular system, since it has an abstract interpretation which is equivalent to it.
This is very important, since no syntax-based inference algorithms have been
proposed so far for this system. Therefore, in this case, abstract interpretation
is the only method that allows this system to be used in practice.

However nice the system is, it has certain deficiencies. The first is the lack
of the subject reduction property. This means that the strictness information
obtained for a particular program (which is guaranteed to be correct by the
correctness proof for this system) cannot be used in general for $\beta$-reducts of this
program (unless one proves that it is still valid after $\beta$-reduction). This makes it
unnecessarily complicated to prove correct any use of this information for
transformed versions of this program which may be generated by an optimizer inside a
compiler. In contrast, if the information were preserved during $\beta$-reduction, such
proofs would not be necessary; one could just use the proof of correctness for
the strictness analysis algorithm.

Another problem with the aforementioned inference system is that in some 
cases the information obtained is not as precise as it could be. This is undesirable, 
since we are always interested in as precise information as is possible to get — 
this desire led to the invention of disjunctive analysis itself! 

Fortunately, both problems can be eliminated with a single step. This step
consists of a slight modification to the disjunction elimination rule the system
includes. The modified system is strictly more powerful than the original one,
and at the same time it does enjoy the subject reduction property. However, we
still lack any algorithm for inference of strictness information in this system. Of
course we cannot directly use the abstract interpretation which is available for
the original system: being equivalent to it, it is not able to compute some
properties which can be derived in the new system. To overcome this difficulty,
we define a syntactic transformation on the program terms such that abstract
interpretation of the transformed terms gives exactly the information that can
be derived in the modified system for the initial terms. This gives us a way of
mechanically computing the information for the modified system, which is a
necessary condition for employing it in practice.

It is also worth noting that the original disjunction elimination rule of the
system is more reminiscent of the sequent-style formulation of the inference
system (and all the other rules are given in the natural deduction style). The
new disjunction elimination rule follows the rest of the system and is indeed
a natural deduction style elimination rule for disjunction.

The paper is organized as follows. In Sect. 2 we define the lazy language
we are concerned with in the rest of the paper. In Sect. 3 we define the class of
disjunctive strictness properties for this language and present the inference
system of [1]. In Sect. 4 we identify its deficiencies. In Sect. 5 we present the modified
system and its properties. In Sect. 6 we present the abstract interpretation for
the system of [1] and show how to use this abstract interpretation to compute
the information which can be inferred in our modified system and prove that
the inference system and the abstract interpretation are equivalent. Sect. 7 gives
suggestions for future work.



2 The Lazy Language 

This section describes the language we shall be working with. Types (ranged
over by $\sigma$, $\tau$) are built from the base type $\iota$ (representing the type of integers)
by means of the function space constructor $\to$. The language of types $\mathcal{T}$ has the
following syntax:

$$\sigma ::= \iota \mid \sigma \to \sigma$$

We assume that for each type $\sigma$ there exists an infinite supply of variable names
$x^\sigma$. Terms of type $\sigma$ are ranged over by $t^\sigma$.

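The following grammar describes the set $\Lambda$ of terms (with the typing rules
built into term formation rules):

$$t ::= n^\iota \mid x^\sigma \mid \cdots \mid (\lambda x^\sigma . t^\tau)^{\sigma \to \tau} \mid (t^{\sigma \to \tau}\, s^\sigma)^\tau$$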

It is not difficult to add other primitive types, like booleans or characters, or
composite types such as products. These are left out of consideration for clarity
of presentation.

It is also easy to give a (lazy) denotational semantics for the language of 
terms in the standard way, so again we omit the definitions. 

3 The Disjunctive Strictness Inference System 

This section presents the definition of disjunctive strictness properties and an
inference system introduced by Jensen in [1].

For each type $\sigma$ of the base language, we define the finite set of strictness
properties $L_\sigma$. The formation rules for strictness properties are defined in Fig. 1.
For each of the families $L_\sigma$ we define the entailment relation $\le_\sigma$. The inference
rules for the entailment are given in Fig. 2 and 3. We take the liberty of omitting
type subscripts when they are clear from the context.

Figure 4 gives the logical (constant-independent) rules of inference, while 
Fig. 5 gives the rules describing the behaviour of the particular constants included 
in the programming language. 



4 The Deficiencies of the System 

Fig. 1. Disjunctive strictness properties

Fig. 2. The entailment relation for strictness properties

The deficiencies of the presented system are closely related to each other, but
have different practical consequences. The first of the difficulties is the lack of
the subject reduction property. It is demonstrated by the following example.
For clarity of presentation we assume in the following examples that our
programming language contains pairs and projections. The same effect can be
obtained with the standard coding of pairs in PCF.

Assume that plus is the operation of adding two integers (which is strict in
both components of its argument). Assume that $N_\bot$ is a divergent term.
Furthermore assume that fst and snd are the first and second projection, respectively.

$$(\lambda x.\ \mathtt{plus}\ (\mathtt{fst}\ x)\ (\mathtt{snd}\ x))\ (\mathtt{if}\ 1\ \mathtt{then}\ (1, N_\bot)\ \mathtt{else}\ (N_\bot, 1)\ \mathtt{fi})$$

The disjunctive system presented above allows us to infer the property $\mathbf{f}_\iota$ for the
shown term, proving that it is divergent. However, when we perform a $\beta$-reduction
step, we obtain the following term:



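$$\mathtt{plus}\ (\mathtt{fst}\ (\mathtt{if}\ 1\ \mathtt{then}\ (1, N_\bot)\ \mathtt{else}\ (N_\bot, 1)\ \mathtt{fi}))\ (\mathtt{snd}\ (\mathtt{if}\ 1\ \mathtt{then}\ (1, N_\bot)\ \mathtt{else}\ (N_\bot, 1)\ \mathtt{fi}))$$

$$\mathbf{t}_{\sigma\to\tau} \le \mathbf{t}_\sigma \to \mathbf{t}_\tau \qquad\qquad \mathbf{t}_\sigma \to \mathbf{f}_\tau \le \mathbf{f}_{\sigma\to\tau}\ \ [\mathbf{f}{\to}]$$

$$\frac{\phi_2 \le \phi_1 \qquad \psi_1 \le \psi_2}{(\phi_1 \to \psi_1) \le (\phi_2 \to \psi_2)}$$

$$(\phi \to \psi_1) \wedge (\phi \to \psi_2) \le (\phi \to \psi_1 \wedge \psi_2)\ \ [{\to}\wedge]$$

$$(\phi_1 \to \psi) \wedge (\phi_2 \to \psi) \le (\phi_1 \vee \phi_2 \to \psi)\ \ [{\to}\vee]$$

Fig. 3. The entailment relation for strictness properties (continued)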



For this term we can only infer the property $\mathbf{t}_\iota$. Note, however, that the latter
term remains divergent according to the semantics of the language.
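
The loss of information can be mimicked in a toy model. The Python sketch below
is not the inference system of [1]; it assumes only two base properties, `f`
(definitely divergent) and `t` (no information), represents the disjunctive
property of the conditional as a list of pair alternatives, and compares
analysing `plus` once per alternative (as when `x` is bound once) with analysing
the two duplicated copies of the conditional independently. All names are
hypothetical.

```python
F, T = "f", "t"   # f: definitely divergent; t: no information

def join(props):
    """Disjunction of base properties: divergent only if every alternative is."""
    return F if all(p == F for p in props) else T

def plus(a, b):
    """plus is strict in both arguments, so one divergent argument suffices."""
    return F if F in (a, b) else T

# Alternatives for: if 1 then (1, N_bot) else (N_bot, 1) fi
cond = [(T, F), (F, T)]

# Before reduction: x is bound once, so each alternative is analysed as a whole.
shared = join(plus(a, b) for a, b in cond)

# After reduction: fst and snd see independent copies of the conditional.
duplicated = plus(join(a for a, _ in cond), join(b for _, b in cond))
```

In this model `shared` evaluates to `f` while `duplicated` evaluates to `t`: the
correlation between the two components of the pair is lost once the argument
has been copied.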

This is a problem if we want to use this inference system in a real compiler and
prove this compiler correct. All optimizing compilers for lazy languages perform
many syntactic transformations on programs being compiled and some of the
transformations correspond to $\beta$-reductions. Therefore, in general, the strictness
information obtained for any particular program cannot be obtained for a
transformed version of the program. If we want to have a compiler which is provably
correct, we have to prove that any transformation applied during compilation
preserves the information that is used later in the compilation process. For a
strictness analysis system which enjoys a subject reduction property we can use
this fact once and for all, but in the case of the presented system we have to
prove each of the transformations correct with respect to the strictness inference
system, which unnecessarily complicates the task of verifying the compiler.

The lack of the subject reduction property also implies another, more serious
problem: for some terms it is possible to get better strictness information after
applying a $\beta$-expansion. However, we cannot use this fact in practice, since the
number of potential $\beta$-expansions which can be applied to any term is infinite
and it is not apparent which of them are necessary to get the best strictness
information possible.



5 A Modified Inference System and its Properties 

In this section we present a modified system and prove that it does not have the 
shortcomings of the original system mentioned in the previous section. 

The modified system is obtained by taking the rule disj' (see Fig. 6) instead
of the disj rule. This rule is used in the system of disjunctive types considered 
in [4] and in other papers on disjunctive types. It seems much more natural than 
the rule disj as it does not introduce unnecessary limitations to the contexts 
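where we can do disjunction elimination.

$$\Gamma_x, x^\sigma : \phi^\sigma \vdash x^\sigma : \phi^\sigma\ \ [\mathrm{var}]$$

$$\frac{\Gamma \vdash t : \phi^\sigma \to \psi^\tau \qquad \Gamma \vdash s : \phi^\sigma}{\Gamma \vdash t\,s : \psi^\tau}\ \ [\mathrm{app}]$$

$$\frac{\Gamma_x, x^\sigma : \phi^\sigma \vdash t : \psi^\tau}{\Gamma_x \vdash (\lambda x^\sigma . t) : \phi^\sigma \to \psi^\tau}\ \ [\mathrm{abs}]$$

$$\frac{\Gamma \vdash t : \phi^\sigma \qquad \Gamma \vdash t : \psi^\sigma}{\Gamma \vdash t : \phi^\sigma \wedge \psi^\sigma}\ \ [\mathrm{conj}]$$

$$\frac{\Gamma_x, x^\tau : \psi_1 \vdash t : \phi^\sigma \qquad \Gamma_x, x^\tau : \psi_2 \vdash t : \phi^\sigma}{\Gamma_x, x^\tau : \psi_1 \vee \psi_2 \vdash t : \phi^\sigma}\ \ [\mathrm{disj}]$$

$$\frac{\Gamma \vdash e : \phi^\sigma \qquad \phi^\sigma \le_\sigma \psi^\sigma}{\Gamma \vdash e : \psi^\sigma}\ \ [\mathrm{sub}]$$

Fig. 4. The logical rules
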

This change, although small, makes a big difference in the essential properties
of the system. We denote the provability of the properties in the modified system
by the symbol $\vdash'$.

In this section we investigate the following essential properties of the modified
system:

- correctness with respect to the standard semantics of the programming language;

- the subject reduction property for parallel $\beta$-reduction;

- strictly better expressive power than the original system.

For our programming language we define the standard denotational semantics
(as for Lazy PCF). In this semantics each type $\sigma$ is interpreted by the semantic
domain $D_\sigma$.

In order to state the correctness theorem we need to define the semantics of 
the disjunctive properties. 

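Definition 1 (semantics of the disjunctive properties).

$$[\![\mathbf{f}_\sigma]\!] = \{\bot_{D_\sigma}\}$$

$$[\![\mathbf{t}_\sigma]\!] = D_\sigma$$

$$[\![\phi \wedge \psi]\!] = [\![\phi]\!] \cap [\![\psi]\!]$$

$$[\![\phi \vee \psi]\!] = [\![\phi]\!] \cup [\![\psi]\!]$$

$$[\![\phi_\sigma \to \psi_\tau]\!] = \{ f \in D_{\sigma\to\tau} \mid f[\![\phi_\sigma]\!] \subseteq [\![\psi_\tau]\!] \}$$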

Fig. 5. The rules for constants (the rule for numerals and the rules [arith], [cond1], [cond2] and [fix])

$$\frac{\Gamma_x, x^\tau : \psi_1 \vdash t : \phi^\sigma \qquad \Gamma_x, x^\tau : \psi_2 \vdash t : \phi^\sigma \qquad \Gamma \vdash s^\tau : \psi_1 \vee \psi_2}{\Gamma \vdash t[s/x] : \phi^\sigma}\ \ [\mathrm{disj'}]$$

Fig. 6. The modified inference rule for disjunctions

In the statement of the correctness theorem and in its proof we use the
following notation. Let $\rho$ be a semantic environment, $\Gamma$ a syntactic environment
(with the same domain), $t^\sigma$ an arbitrary term and $\phi^\sigma$ an arbitrary disjunctive
property. The notation $\rho \models \Gamma$ means that for every variable $x^\sigma$ such that
$x^\sigma \in dom(\rho)$, $\Gamma(x^\sigma)$ is defined and $\rho\, x^\sigma \in [\![\Gamma(x^\sigma)]\!]$. The notation $[\![t]\!]\rho \models \phi^\sigma$
means that $[\![t]\!]\rho \in [\![\phi^\sigma]\!]$. The notation $\Gamma \models t : \phi^\sigma$ means that $[\![t]\!]\rho \models \phi^\sigma$ for
all environments $\rho$ such that $\rho \models \Gamma$.

Theorem 1 (correctness of the modified disjunctive system). Let t be a
term of type $\sigma$ and $\Gamma$ be an environment assigning disjunctive properties to all
free variables of t (in a type-correct way). Let $\rho_\Gamma$ be defined as $\rho_\Gamma(x) = [\![\Gamma(x)]\!]$.
Then for an arbitrary disjunctive property $\phi_\sigma$ we have the following:

$$\Gamma \vdash' t : \phi_\sigma \;\Rightarrow\; \Gamma \models t : \phi_\sigma$$

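Theorem 2 (subject reduction property for parallel $\beta$-reduction). Let
t be a term of type $\sigma$ and $\Gamma$ an environment assigning disjunctive properties to
all free variables of the term t. Let t' be a term such that t reduces to t' by
parallel $\beta$-reduction. Then for any property $\phi^\sigma$ we have:

$$\Gamma \vdash' t : \phi^\sigma \;\Rightarrow\; \Gamma \vdash' t' : \phi^\sigma$$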

The modified disjunctive system is at least as strong as the original system,
since each deduction using the disj rule can be easily transformed into a deduction
using a limited version of the disj' rule, and there are properties derivable
in this system which are not derivable in the original one (see the example at
the beginning of the paper).

6 An Abstract Interpretation for the Modified Inference 
System 

Now we need to find a way of computing properties in the modified system in
an automatic way. We do not have a syntactic inference algorithm. Instead we
find a way of using the abstract semantics given for the original inference system
in [1]. In this section we define the abstract semantics and state its properties
proved in [1]. Then we define a syntactic operation of sufficient expansion for
program terms and prove that the abstract semantics for the original disjunctive
system applied to sufficiently expanded terms computes exactly the properties
that can be inferred in the modified system for the non-expanded terms.

The abstract domains and the abstract semantics of the disjunctive properties
are given by Definition 2. Figure 7 defines the abstract semantics for our
programming language. The notation $\mathcal{P}_L(D)$ used in Definition 2 means the
lower powerdomain of the domain D (i.e. the set of all ideals of D ordered by
set inclusion).

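Definition 2 (abstract semantics of the disjunctive properties). The
abstract semantic domains for disjunctive properties are defined as follows ($A \multimap B$
is the set of all linear (in the lattice-theoretic sense) functions from A to B):

$$A_\iota = \mathcal{P}_L(\text{some finite lattice of properties})$$

$$A_{\sigma\to\tau} = \mathcal{P}_L(A_\sigma \multimap A_\tau)$$

Let $\phi_\sigma \in L_\sigma$ and $\psi_\tau \in L_\tau$ be arbitrary disjunctive properties.
We define $[\![\cdot]\!]^A_\sigma : L_\sigma \to A_\sigma$ as follows:

$$[\![\phi]\!]^A_\sigma = \text{some element of } A_\sigma, \text{ where } \phi \text{ is a constant property}$$

$$[\![\phi_1 \wedge \phi_2]\!]^A_\sigma = [\![\phi_1]\!]^A_\sigma \cap [\![\phi_2]\!]^A_\sigma$$

$$[\![\phi_1 \vee \phi_2]\!]^A_\sigma = [\![\phi_1]\!]^A_\sigma \cup [\![\phi_2]\!]^A_\sigma$$

$$[\![\phi \to \psi]\!]^A_{\sigma\to\tau} = \{ f \in A_\sigma \multimap A_\tau \mid f([\![\phi]\!]^A_\sigma) \subseteq [\![\psi]\!]^A_\tau \}$$

□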

In [1] the inference system and the abstract interpretation are proved equiv- 
alent. The inference system defined in this paper is stronger than the one in [1], 
but it is impossible to define any natural compositional abstract interpretation
which would be equivalent to the system. To solve this problem we use the un- 
modified abstract interpretation and define a simple syntactic transformation on 
programs such that the abstract interpretation of the transformed programs is 
equivalent to the inference of properties in our inference system.

Fig. 7. The abstract semantics (clauses for variables, constant terms, application and abstraction)



The idea of performing syntactic transformations on programs as a preprocessing
phase for static analysis is not new. The reader may consult [5] and [6]
for a discussion on using a continuation-passing style transformation on lazy
programs to obtain better strictness information.

The transformation we are about to present is much simpler and essentially
tries to recover the lazy reduction semantics (i.e. graph reduction) in the
framework of ordinary term reduction, so that duplicated subterms are eliminated by
appropriate $\beta$-expansion.

Definition 3 (sufficient expansion). Let $t^\sigma$ be a term satisfying the following
conditions:

1. all free variables of t are different from all its bound variables;

2. no bound variable is bound again in the scope of its binding lambda abstraction;

3. each two subterms $t_\sigma$, $t'_\sigma$ which are identical modulo $\alpha$-conversion of bound
variables are identical (note that we consider the Church-style calculus, so
any two $\alpha$-convertible terms are of the same type).

We define the operation SE(t) by induction on t in the following way:

1. t = x; we take SE(x) = x

2. t = c; we take SE(c) = c

3. $t = \lambda x.s$; we take $SE(\lambda x.s) = \lambda x.SE(s)$

4. $t = t_1 t_2$; if there is a term s which is not a variable and such that it occurs
both in $t_1$ and in $t_2$ and also no free variable of s is bound in t, then assume
that u is a minimal term with this property; let $t'_1$ and $t'_2$ be obtained from $t_1$
and $t_2$ by replacing all occurrences of u with a fresh variable x; in this case
we take $SE(t_1 t_2) = (\lambda x.SE(t'_1 t'_2))(SE(u))$; otherwise (there is no such s) we
take $SE(t_1 t_2) = SE(t_1)SE(t_2)$

The definition of sufficient expansion we have given actually defines a relation,
not a function (since it depends on the sequence of choices of minimal subterms
and the choice of fresh variables). It is much more convenient for our purpose
to consider it as a function. Therefore we assume that the choice of minimal
subterms and the choice of fresh variable names in the definition is performed
according to some linear ordering on terms (like the lexicographic ordering).
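
The operation can be prototyped in a few lines. The following Python sketch is
one possible reading of Definition 3, not the authors' implementation: the
tuple encoding of terms, the helper names, the `_x0, _x1, ...` scheme for fresh
variables (assumed not to clash with program variables), and the tie-break by
size and then by `repr` are all assumptions. Following the remark that only
terms of more than one symbol are abstracted out, constants are not treated as
candidates for u.

```python
import itertools

# Hypothetical term encoding:
#   ("var", name) | ("const", name) | ("lam", name, body) | ("app", fun, arg)

def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if t[0] == "lam":
        yield from subterms(t[2])
    elif t[0] == "app":
        yield from subterms(t[1])
        yield from subterms(t[2])

def size(t):
    return sum(1 for _ in subterms(t))

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def bound_vars(t):
    return {s[1] for s in subterms(t) if s[0] == "lam"}

def replace(t, u, x):
    """Replace every occurrence of the subterm u in t by the variable x."""
    if t == u:
        return ("var", x)
    if t[0] == "lam":
        return ("lam", t[1], replace(t[2], u, x))
    if t[0] == "app":
        return ("app", replace(t[1], u, x), replace(t[2], u, x))
    return t

def se(t, fresh=None):
    """Sufficient expansion of t (assumes the three conditions of Definition 3)."""
    if fresh is None:
        fresh = (f"_x{i}" for i in itertools.count())
    if t[0] in ("var", "const"):
        return t
    if t[0] == "lam":
        return ("lam", t[1], se(t[2], fresh))
    _, t1, t2 = t
    bound = bound_vars(t)
    in_t2 = set(subterms(t2))
    # compound subterms shared by t1 and t2 whose free variables are not bound in t
    shared = [s for s in set(subterms(t1))
              if s in in_t2 and s[0] in ("lam", "app")
              and not (free_vars(s) & bound)]
    if not shared:
        return ("app", se(t1, fresh), se(t2, fresh))
    u = min(shared, key=lambda s: (size(s), repr(s)))   # fixed linear order
    x = next(fresh)
    body = ("app", replace(t1, u, x), replace(t2, u, x))
    return ("app", ("lam", x, se(body, fresh)), se(u, fresh))
```

Applied to a term of the shape `plus (fst c) (snd c)`, where `c` is a compound
term, `se` returns `(\x. plus (fst x) (snd x)) c`, which mirrors the
$\beta$-expansion that restores the sharing lost in the example of Sect. 4.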

This transformation is not difficult to implement. Furthermore, it does not 
increase the size of the program (we only abstract out terms which consist of more 
than one symbol, so a simple induction shows that the size of the program does 
not increase during transformations). Thus it can be used in real-life compilers 
for lazy languages as a simple preprocessing phase to strictness analysis.

The following two lemmas relate derivability of properties in the systems $\vdash$
and $\vdash'$.

Lemma 1 (syntactic soundness of $\vdash'$). For any term t of type $\sigma$, any
environment $\Gamma$ and any property $\phi_\sigma$ the following implication holds:

$$\Gamma \vdash' t : \phi_\sigma \;\Rightarrow\; \Gamma \vdash SE(t) : \phi_\sigma$$

Lemma 2 (syntactic completeness of $\vdash'$). For any term t of type $\sigma$, any
environment $\Gamma$ and any property $\phi_\sigma$ the following implication holds:

$$\Gamma \vdash SE(t) : \phi_\sigma \;\Rightarrow\; \Gamma \vdash' t : \phi_\sigma$$

The following theorems allow us to use the abstract interpretation defined in
Sect. 6 to compute properties in the system $\vdash'$. They are easy consequences of the
lemmas proved in this section and the correctness and completeness results of
[1].

Theorem 3 (completeness of $\vdash'$ wrt the abstract semantics). Let t be a
term of type $\sigma$. Let $\Gamma$ and $\rho$ be environments such that $\rho(x) = [\![\Gamma(x)]\!]$ for each
variable x occurring free in t. Then for any property $\phi_\sigma$ the following holds:

$$[\![SE(t)]\!]^A \rho \subseteq [\![\phi_\sigma]\!]^A \;\Rightarrow\; \Gamma \vdash' t : \phi_\sigma$$

Theorem 4 (correctness of $\vdash'$ wrt the abstract semantics). Let t be a
term of type $\sigma$. Let $\Gamma$ be an environment assigning disjunctive properties to free
variables of t. Let $\rho$ be defined as $\rho(x) = [\![\Gamma(x)]\!]$. Then for any property $\phi_\sigma$ the
following holds:

$$\Gamma \vdash' t : \phi_\sigma \;\Rightarrow\; [\![SE(t)]\!]^A \rho \subseteq [\![\phi_\sigma]\!]^A$$


7 Conclusions and Future Work 

In this paper we studied the problem of disjunctive strictness analysis for lazy
languages. We defined a new system which improves upon the previous work by
giving more precise strictness information and by having better properties (the
subject reduction property for parallel $\beta$-reductions). For this system we defined
a method of computing strictness information by abstract interpretation.

However, there are still many open problems related to disjunctive strictness
analysis. Some of the most interesting ones are the following:

- finding a (syntactic) inference algorithm for disjunctive properties;

- adaptation of known optimized abstract interpretation algorithms to the
system with disjunctive domains and experimental assessment of their efficiency;

- proving some kind of polymorphic invariance result for disjunctive properties;

- extension of the disjunctive analysis techniques to various kinds of static
analysis problems (like binding-time analysis), with new disjunctive domains and
new disjunctive inference rules for constants.

We intend to investigate some of them in the near future.

Abstract. We develop an operational theory of higher-order functions,
recursion, and fair non-determinism for a non-trivial, higher-order,
call-by-name functional programming language extended with McCarthy’s
amb. Implemented via fair parallel evaluation, functional programming
with amb is very expressive. However, conventional semantic fixed point
principles for reasoning about recursion fail in the presence of fairness.
Instead, we adapt higher-order operational methods to deal with fair
non-determinism. We present two natural semantics, describing may-
and must-convergence, and define a notion of contextual equivalence over
these two modalities. The presence of amb raises special difficulties when
reasoning about contextual equivalence. In particular, we report on a
challenging open problem with regard to the validity of bisimulation
proof methods. We develop two sound and useful reasoning methods
which, in combination, enable us to prove a rich collection of laws for
contextual equivalence and also provide a unique fixed point induction
rule, the first proof rule for reasoning about recursion in the presence of
fair non-determinism.



1 Introduction 

First introduced in [12], McCarthy’s amb, or ambiguous choice, has a simple 
informal operational description: evaluate each operand of the choice in fair 
parallel and accept the first to terminate as the result. This process will terminate 
when either operand does, but can only loop when both operands do. 
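
The informal description can be made concrete with a small simulation. The
Python sketch below is an illustration only and not the semantics developed in
this paper: it assumes that each operand is modelled as a generator that yields
once per evaluation step and returns its result, and it realises fairness by
strict alternation. The names `amb`, `countdown` and `loop` are hypothetical.

```python
def amb(left, right):
    """Run two step-wise computations in fair alternation; return the first result."""
    tasks = [left, right]
    while True:
        for task in tasks:
            try:
                next(task)               # perform one evaluation step
            except StopIteration as done:
                return done.value        # the first operand to terminate wins

def countdown(steps, result):
    """A computation that terminates with `result` after `steps` steps."""
    while steps > 0:
        steps -= 1
        yield
    return result

def loop():
    """A computation that never terminates."""
    while True:
        yield
```

With these definitions, `amb(loop(), countdown(3, "ok"))` returns `"ok"` even
though the left operand diverges, whereas `amb(loop(), loop())` would run
forever, matching the statement that the choice can only loop when both
operands do.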

Fair parallel evaluation is a powerful programming idiom and functional pro- 
gramming with amb is very expressive. It subsumes many other, more commonly 
studied non-deterministic and parallel functional language extensions - e.g., er- 
ratic choice, countable choice, and parallel or are all straightforwardly encoded 
in terms of amb - but it is also substantially more difficult to model than each 
of these. In particular, conventional semantic fixed point principles for reasoning 
about recursion fail in the presence of fairness: amb is not even monotonic, let
alone continuous, with respect to any domain-theoretic partial order respecting 
total correctness.

Instead, we adapt higher-order operational methods to give an operational
theory of algebraic datatypes, higher-order functions, recursion, and fair
non-determinism for a call-by-name functional programming language extended with
McCarthy’s amb. This is an interesting and challenging application of such
techniques, since we cannot look to domain theory for inspiration. The theory
contains not only the expected equational laws, but also a proof rule for proving
properties of recursive programs containing amb: unique fixed point induction.

The paper is organised as follows. Section 2 surveys related work. Section 3
presents the language and its operational semantics in the form of natural
semantics rules for may-convergence and must-convergence relations. Section 4
defines contextual equivalence. Our first major result is that contextual
equivalence includes a simpler Kleene equivalence relation. It is used to establish many
expected equational laws. Section 5 introduces simulation and bisimulation
relations. Section 6 is devoted to a cost-sensitive simulation relation, based upon an
operational notion of cost. Our second major result is that this is a congruence
and, hence, included in contextual equivalence. We develop a tick algebra for
the language and establish the powerful unique fixed point induction proof rule.
Unique fixed point induction allows us to prove two terms contextually
equivalent by exhibiting a single program context. We demonstrate its use in section 7,
where it is used to prove properties of bottom-avoiding merge.

2 Related Work 

Traditionally, one gives meaning to non-deterministic constructs via
powerdomain constructions, domain-theoretic analogues of the powerset operator;
see [21] for an accessible survey. The denotational approach encounters
well-known problems when attempting to model McCarthy’s amb. For example, the
Egli-Milner ordering is a natural preorder to consider (it combines
domain-theoretic analogues of may- and must-behaviours), but McCarthy’s amb is
not even monotonic with respect to this ordering, so the theory breaks down.
The only attempt at a denotational semantics for McCarthy’s amb, due to Broy
[3], is developed for a first-order language only, and it is rather cumbersome
and complex. Possibly, Moschovakis’ powerstructures [16] are an alternative
candidate for a model of a higher-order language with fair non-determinism.

Higher-order co-inductive operational techniques, based upon Howe’s method
for proving congruence of bisimulation equivalences for functional languages
[7], have been applied to non-deterministic functional languages: Howe [7] and
Ong [17] consider erratic choice (where one operand is chosen and then
evaluated, while the other is discarded), showing that notions of simulation
pre-orders and bisimulation equivalences are congruences. Lassen and Pitcher
[11] prove congruence of a space of simulation pre-orders and bisimulation
equivalences for countable choice (which returns a random natural number).
Quite different operational techniques are used by Moran, Sands, and Carlsson
[15] to develop an equational theory for call-by-need and erratic choice. Both
erratic and countable choice are sequential forms of non-determinism; it is not
immediate how to extend any of these methods to the fair non-determinism
exhibited by McCarthy’s amb.

Here we extend these methods by using an alternative to Howe’s relational
construction (from [9]), and by introducing costs (following Sands’ improvement
theory for deterministic languages [18]). The unique fixed point induction rule
is adapted from [18]. Our adaptation inspired a similar rule in [15], where it
is used in an extensive treatment of a theory for erratic choice (that respects
sharing) for the Fudgets’ stream processor calculus.

This paper reports on results from the authors’ dissertations [10, 13]. Mostly, 
proofs are sketched or omitted; detailed proofs may be found in [13].

3 The Operational Semantics 

We begin by introducing the operational semantics for a call-by-name $\lambda$-calculus
plus McCarthy’s amb. We give natural semantics for both may-convergence and
must-convergence. We need to describe must-convergence behaviour since amb 
is distinguished from erratic choice by its must-convergence (or equivalently, 
divergent) behaviour. In anticipation of the development of a cost-sensitive op- 
erational theory, we attribute costs to the two natural semantics. 

The terms in the language are of the form: 

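$$x, y, z \in \mathit{Variable} \qquad K \in \mathit{Constructor} \qquad \mathit{op} \in \mathit{Primitive}$$

$$M, N ::= x \mid \lambda x.M \mid M\,N \mid M \mathbin{[]} N \mid \ulcorner n \urcorner \mid M \mathbin{\mathit{op}} N \mid K\,M_1 \cdots M_m \mid \mathsf{fix}\,M \mid \mathsf{case}\ M\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n}$$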

Constructor is a set of names, disjoint from Variable, which may not be bound
or alpha converted; we will represent lists by assuming a nullary constructor
nil, denoted by [], and a binary constructor cons, denoted by the infix symbol
(:). Primitive is the set of standard integer primitives (addition, division,
equality, etc.); equality and related primitives map pairs of integers to
nullary constructors true and false. For each integer n, we have a
distinguished value, written $\ulcorner n \urcorner$. Other values are lambda
expressions and constructed values; all values are ranged over by U and V. In
addition we have explicit recursion and case expressions. We denote McCarthy’s
amb by the infix symbol [].

For illustration let us see how other non-deterministic operators can be
encoded using amb. Erratic choice, written as an infix $\oplus$, makes an
initial choice between its two operands. The selected operand is evaluated and
the other is discarded. Using amb, we can simulate erratic choice between terms
M and N thus:

$$M \oplus N \stackrel{\mathrm{def}}{=} ((K\,M) \mathbin{[]} (K\,N))\,I$$

where $K \stackrel{\mathrm{def}}{=} \lambda x y.x$, and $I \stackrel{\mathrm{def}}{=} \lambda x.x$. Countable choice, also known as random
assignment and denoted by ?, can evaluate to any natural number but cannot
diverge. It can be expressed as:

$$? \stackrel{\mathrm{def}}{=} \mathsf{fix}\ \lambda x.\, 0 \mathbin{[]} (x + 1).$$
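
The difference between amb and erratic choice can be made concrete with a small simulation. The following is a minimal Python sketch, not the paper’s semantics: computations are modelled as generators that yield once per step, amb is modelled by fair interleaving of its two operands, and erratic choice is obtained through the encoding given above. The names ret, diverge, amb, erratic, run and the parameter first are hypothetical and exist only for this illustration.

```python
DIVERGED = object()  # sentinel: the fuel ran out before a value was produced


def ret(v):
    """A computation that converges to v without taking a step."""
    return v
    yield  # unreachable; its presence makes ret a generator function


def diverge():
    """A computation that steps forever (stands in for a divergent term)."""
    while True:
        yield


def amb(m, n, first=0):
    """Fair choice: alternate single steps of m and n, return the first value.

    `first` (a hypothetical parameter) selects which operand is stepped first.
    """
    threads = [m, n] if first == 0 else [n, m]
    while True:
        for t in threads:
            try:
                next(t)                  # one step of this operand
            except StopIteration as done:
                return done.value        # this operand converged
            yield                        # one step of the amb expression


def erratic(make_m, make_n, first=0):
    """Erratic choice through the encoding ((K M) [] (K N)) I.

    K M and K N are already values, so amb returns one of them at once;
    only afterwards is the chosen operand evaluated.
    """
    chosen = yield from amb(ret(make_m), ret(make_n), first)
    return (yield from chosen())


def run(comp, fuel=1000):
    """Drive a computation for at most `fuel` steps."""
    for _ in range(fuel):
        try:
            next(comp)
        except StopIteration as done:
            return done.value
    return DIVERGED


# amb avoids the divergent operand whichever operand is stepped first ...
assert run(amb(diverge(), ret(3), first=0)) == 3
assert run(amb(diverge(), ret(3), first=1)) == 3
# ... whereas erratic choice may commit to the divergent one.
assert run(erratic(diverge, lambda: ret(3), first=0)) is DIVERGED
assert run(erratic(diverge, lambda: ret(3), first=1)) == 3
```

In this model a bounded amount of fuel stands in for divergence, so the sketch can only suggest, not establish, the must-convergence behaviour discussed in the text.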
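
$$\lambda x.M \Downarrow^{0} \lambda x.M \quad (\mathit{Lam}_{\Downarrow n})
\qquad
K\,M_1 \cdots M_n \Downarrow^{0} K\,M_1 \cdots M_n \quad (\mathit{Constr}_{\Downarrow n})
\qquad
\ulcorner n \urcorner \Downarrow^{0} \ulcorner n \urcorner \quad (\mathit{Int}_{\Downarrow n})$$

$$\frac{M \Downarrow^{n} V}{M \mathbin{[]} N \Downarrow^{n} V} \quad (\mathit{Amb}_{\Downarrow n})
\qquad
\frac{N \Downarrow^{n} V}{M \mathbin{[]} N \Downarrow^{n} V} \quad (\mathit{Amb}_{\Downarrow n})
\qquad
\frac{M\,(\mathsf{fix}\,M) \Downarrow^{n} V}{\mathsf{fix}\,M \Downarrow^{n} V} \quad (\mathit{Fix}_{\Downarrow n})$$

$$\frac{M \Downarrow^{m} \ulcorner i \urcorner \qquad N \Downarrow^{n} \ulcorner j \urcorner}{M \mathbin{\mathit{op}} N \Downarrow^{m+n} \ulcorner i \mathbin{\mathit{op}} j \urcorner}
\qquad
\frac{M \Downarrow^{m} \lambda x.M' \qquad M'[N/x] \Downarrow^{n} V}{M\,N \Downarrow^{m+n+1} V} \quad (\mathit{App}_{\Downarrow n})$$

$$\frac{M \Downarrow^{m} K_j\,M_1 \cdots M_{m_j} \qquad N_j[M_k/x_k]_{k=1}^{m_j} \Downarrow^{n} V}{\mathsf{case}\ M\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} \Downarrow^{m+n+1} V} \quad (\mathit{Case}_{\Downarrow n})$$

Fig. 1. Natural semantics for may-convergence, instrumented with costs.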



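A more involved example is the “bottom-avoiding” merge operator defined thus:

$$\mathit{merge} \stackrel{\mathrm{def}}{=} \mathsf{fix}\ \lambda m.\, \lambda xs.\, \lambda ys.\
\left(\begin{array}{l} \mathsf{case}\ xs\ \mathsf{of} \\ \quad [] \to ys \\ \quad z : zs \to z : m\ zs\ ys \end{array}\right)
\mathbin{[]}
\left(\begin{array}{l} \mathsf{case}\ ys\ \mathsf{of} \\ \quad [] \to xs \\ \quad z : zs \to z : m\ xs\ zs \end{array}\right)$$
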



It non-deterministically merges two possibly infinite lists. If one of the two lists 
is finite, every element of the other appears in the output. 

The natural semantics rules for may-convergence behaviour are given in
figure 1. We write $M \Downarrow^{n} V$ to mean that closed term M may converge to value
V with cost n. The rules in figure 1 describe a leftmost-outermost evaluation
strategy to weak head normal form and are straightforward. The constructor
and case rules specify that constructors are lazy in our language.

The cost measures the number of function application steps and constructor
elimination steps in the computation. These steps involve general substitutions.
The cost measure is motivated by technical requirements in the development of
the cost-sensitive theory later on. Note that we can recover the usual definition
of may-convergence easily: $M \Downarrow V \stackrel{\mathrm{def}}{=} \exists n.\, M \Downarrow^{n} V$. We also write $M \Downarrow$ to mean
that there exists some value V such that $M \Downarrow V$.
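
To see the cost instrumentation at work, the may-convergence relation can be explored mechanically on a small fragment of the language. The following is a minimal Python sketch under stated assumptions: terms are nested tuples, only variables, integers, lambda, application, amb, fix and a single primitive (addition) are covered, values cost 0, and each function application adds one unit, as described in the text. The function names subst and may, the tuple encoding, and the depth bound are hypothetical; the bound is needed because a term such as countable choice has infinitely many derivations.

```python
def subst(term, x, s):
    """Substitute the closed term s for the variable x in term."""
    tag = term[0]
    if tag == "var":
        return s if term[1] == x else term
    if tag == "int":
        return term
    if tag == "lam":
        # If this lambda rebinds x, stop; otherwise descend into the body.
        return term if term[1] == x else ("lam", term[1], subst(term[2], x, s))
    if tag == "fix":
        return ("fix", subst(term[1], x, s))
    return (tag, subst(term[1], x, s), subst(term[2], x, s))  # app, amb, add


def may(term, depth):
    """All (value, cost) pairs that have a derivation of height <= depth."""
    if depth == 0:
        return set()
    tag, d = term[0], depth - 1
    if tag in ("lam", "int"):                 # values converge at cost 0
        return {(term, 0)}
    if tag == "amb":                          # either operand may be chosen
        return may(term[1], d) | may(term[2], d)
    if tag == "fix":                          # fix M behaves as M (fix M)
        return may(("app", term[1], term), d)
    if tag == "add":                          # costs of both operands add up
        return {(("int", a[1] + b[1]), m + n)
                for a, m in may(term[1], d) if a[0] == "int"
                for b, n in may(term[2], d) if b[0] == "int"}
    if tag == "app":                          # one extra unit per application
        return {(v, m + n + 1)
                for f, m in may(term[1], d) if f[0] == "lam"
                for v, n in may(subst(f[2], f[1], term[2]), d)}
    raise ValueError(f"not a closed term of the fragment: {term!r}")


K = ("lam", "x", ("lam", "y", ("var", "x")))
I = ("lam", "x", ("var", "x"))

# Erratic choice between 1 and 2, encoded as ((K 1) [] (K 2)) I.
erratic_term = ("app", ("amb", ("app", K, ("int", 1)), ("app", K, ("int", 2))), I)
assert may(erratic_term, 10) == {(("int", 1), 2), (("int", 2), 2)}

# Countable choice: fix (lambda x. 0 [] (x + 1)), explored to a bounded depth.
choice = ("fix", ("lam", "x",
                  ("amb", ("int", 0), ("add", ("var", "x"), ("int", 1)))))
assert sorted(may(choice, 12)) == [(("int", 0), 1), (("int", 1), 2), (("int", 2), 3)]
```

Raising the depth bound for the countable choice term adds further naturals to the result, each one unit more expensive than the last, which matches the reading that the term may converge to any natural number.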

The natural semantics rules for must-convergence behaviour are given in
figure 2. We write $M \downarrow^{\alpha}$ to mean that closed term M must converge at cost $\alpha$. The
cost is an ordinal which is, roughly, the “supremum” of the costs of
converging computations that witness that M must converge (they form a recursive,
countably branching tree and $\alpha$ is measured as the height of this tree, hence $\alpha$
is a recursive ordinal; cf. [1,11]). We can recover simple must-convergence thus:
$M \downarrow \stackrel{\mathrm{def}}{=} \exists \alpha.\, M \downarrow^{\alpha}$.

The two $(\mathit{Amb}_{\downarrow\alpha})$ rules distinguish amb from erratic choice, and this
highlights the need for a description of must-convergence behaviour. For an erratic
choice to always converge, both branches must always converge. For McCarthy’s
amb to always converge, it is sufficient that at least one operand does so. In [13]
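
$$\lambda x.M \downarrow^{0} \quad (\mathit{Lam}_{\downarrow\alpha})
\qquad
K\,M_1 \cdots M_n \downarrow^{0} \quad (\mathit{Constr}_{\downarrow\alpha})
\qquad
\ulcorner n \urcorner \downarrow^{0} \quad (\mathit{Int}_{\downarrow\alpha})$$

$$\frac{M \downarrow^{\alpha}}{M \mathbin{[]} N \downarrow^{\alpha}} \quad (\mathit{Amb}_{\downarrow\alpha})
\qquad
\frac{N \downarrow^{\alpha}}{M \mathbin{[]} N \downarrow^{\alpha}} \quad (\mathit{Amb}_{\downarrow\alpha})
\qquad
\frac{M \downarrow^{\alpha_1} \qquad N \downarrow^{\alpha_2}}{M \mathbin{\mathit{op}} N \downarrow^{\alpha_1 + \alpha_2}}
\qquad
\frac{M\,(\mathsf{fix}\,M) \downarrow^{\alpha}}{\mathsf{fix}\,M \downarrow^{\alpha}} \quad (\mathit{Fix}_{\downarrow\alpha})$$

$$\frac{M \downarrow^{\alpha} \qquad \forall V.\ M \Downarrow V \implies V = \lambda x.M' \wedge M'[N/x] \downarrow^{\alpha_V}}{M\,N \downarrow^{(\bigsqcup_V \alpha_V + 1) + \alpha}} \quad (\mathit{App}_{\downarrow\alpha})$$

$$\frac{M \downarrow^{\alpha} \qquad \forall V.\ M \Downarrow V \implies V = K_j\,M_1 \cdots M_{m_j} \wedge N_j[M_k/x_k]_{k=1}^{m_j} \downarrow^{\alpha_V}}{\mathsf{case}\ M\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} \downarrow^{(\bigsqcup_V \alpha_V + 1) + \alpha}} \quad (\mathit{Case}_{\downarrow\alpha})$$

Fig. 2. Natural semantics for must-convergence, instrumented with costs.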



the $\downarrow$ predicate is shown to be an accurate description of the must-convergence
behaviour of the fair parallel evaluation semantics of amb alluded to in section 1.

When deciding whether or not $M\,N \downarrow$, not only must M converge, but it
may only converge to lambda expressions. A similar restriction affects the rule
for case expressions. This is due to the fact that we are working with an untyped
language. The rules $(\mathit{App}_{\downarrow\alpha})$ and $(\mathit{Case}_{\downarrow\alpha})$ are infinitary, since in each case M
may converge to countably many different values.

4 The Equational Theory 

In this section, we first define a Morris-style contextual equivalence for the
language, faithful to both may-convergence and must-convergence behaviour. We
then establish a collection of equational laws for contextual equivalence by means
of a simpler, more tractable equivalence relation, called Kleene equivalence.

In the following definition, C ranges over program contexts relative to M and
N, that is, terms C with an occurrence of a hole $[\cdot]$ such that the terms C[M]
and C[N], obtained by filling in M and N for $[\cdot]$, are closed.

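Definition 1. We define contextual equivalence, $\cong$, by

$$M \cong N \stackrel{\mathrm{def}}{=} \forall C.\ \bigl( (C[M] \Downarrow \iff C[N] \Downarrow) \ \wedge\ (C[M] \downarrow \iff C[N] \downarrow) \bigr).$$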

If we read the may-converge predicate as a simple partial correctness assertion
and the must-converge predicate as a simple total correctness assertion, the
two conjuncts in the definition assert that contextual equivalence respects both
partial and total correctness (i.e., closed contextually equivalent terms satisfy
the same correctness assertions). The quantification over contexts makes
contextual equivalence the greatest such relation which is closed under contexts:
$M \cong N \implies C[M] \cong C[N]$.

Contextual definitions are useful for showing two terms to be distinct (all
that is needed is a witnessing context that exhibits different behaviour for each
term), but cumbersome to use for proving equivalence, since the definitions
involve quantification over all contexts. For this reason, we seek more tractable
characterisations of contextual equivalence, or at least, sound approximations
that yield valid methods for establishing contextual equivalences between terms.
We shall pursue this quest in greater depth in the next sections, but first let
us introduce a simple-minded approximation to contextual equivalence, called
Kleene equivalence. It relates terms with identical convergence behaviour:

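Definition 2. We define Kleene equivalence, $\approx$, between closed terms by

$$M \approx N \stackrel{\mathrm{def}}{=} (\forall V.\ M \Downarrow V \iff N \Downarrow V) \ \wedge\ (M \downarrow \iff N \downarrow).$$

For open terms, $M \approx N$ if and only if $M\sigma \approx N\sigma$ for all closing substitutions $\sigma$.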

It is often simple to verify that two terms are Kleene equivalent, and
surprisingly many fundamental equational laws, e.g. $\beta$-laws, are instances of Kleene
equivalence.

Our first major result is that Kleene equivalence is included in contextual
equivalence:

Lemma 1. $\approx\ \subseteq\ \cong$.

The proof is postponed to the next section.

We note that the inclusion is strict. In particular, $\approx$ is not closed under contexts, e.g. $M \approx N$
implies $\lambda x.M \approx \lambda x.N$ only if M and N are identical, whereas $\lambda x.M \cong \lambda x.N$
holds whenever $M \cong N$; and similarly for constructed values.

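Some selected equational laws for call-by-name and McCarthy’s amb are:

$$\begin{array}{rcll}
(\lambda x.M)\,N & \cong & M[N/x] & (\beta) \\
\mathsf{case}\ K_j\,M_1 \cdots M_{m_j}\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} & \cong & N_j[M_k/x_k]_{k=1}^{m_j} & (\text{case-}\beta) \\
\mathsf{fix}\,M & \cong & M\,(\mathsf{fix}\,M) & (\text{fix-}\beta) \\
\mathsf{case}\ \Omega\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} & \cong & \Omega & (\text{case-strict}) \\
M \mathbin{[]} M & \cong & M & ([]\text{-idem}) \\
M \mathbin{[]} N & \cong & N \mathbin{[]} M & ([]\text{-comm}) \\
(L \mathbin{[]} M) \mathbin{[]} N & \cong & L \mathbin{[]} (M \mathbin{[]} N) & ([]\text{-assoc}) \\
M \mathbin{[]} \Omega & \cong & M & ([]\text{-}\Omega)
\end{array}$$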

These are all justified by lemma 1. As an example, we prove the validity of
($\beta$). It suffices to prove it for closed terms $\lambda x.M$ and N; it then follows easily
for open terms as well. By $(\mathit{App}_{\Downarrow n})$, whenever $(\lambda x.M)\,N$ may converge to some
V, $M[N/x]$ may also converge to the same V (and vice versa). By $(\mathit{App}_{\downarrow\alpha})$,
whenever $(\lambda x.M)\,N$ must converge, so must $M[N/x]$ (and vice versa). Therefore
$(\lambda x.M)\,N \approx M[N/x]$, and, by lemma 1, $(\lambda x.M)\,N \cong M[N/x]$. The other laws of
the equational theory follow similarly.
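
As a further illustration, the same two-step argument can be written out for the idempotence law of amb. This is a sketch in the style of the proof of ($\beta$), for a closed term M, and is not a proof taken from the paper; it relies only on the two may-convergence rules and the two must-convergence rules for amb.

```latex
\begin{align*}
M \mathbin{[]} M \Downarrow V
  &\iff M \Downarrow V \ \text{or}\ M \Downarrow V
  && \text{(the two may-convergence rules for amb)} \\
  &\iff M \Downarrow V, \\
M \mathbin{[]} M \downarrow
  &\iff M \downarrow \ \text{or}\ M \downarrow
  && \text{(the two must-convergence rules for amb)} \\
  &\iff M \downarrow .
\end{align*}
```

Hence $M \mathbin{[]} M \approx M$, and lemma 1 yields $M \mathbin{[]} M \cong M$.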

A few other properties of $\cong$ can be derived from its definition: it is an
equivalence relation (reflexive, transitive and symmetric), and it is closed
under contexts. These properties combined with ($\beta$) entail that $\cong$ is substitutive:
$M \cong N \wedge M' \cong N' \implies M[M'/x] \cong N[N'/x]$.

5 Simulation

In the previous section we saw several good uses of Kleene equivalence for rea- 
soning about contextual equivalence. But Kleene equivalence is too discriminat- 
ing when considering programs which may-converge to higher-order values, i.e. 
lambda expressions or (lazy) constructed values. For instance, it cannot be used 
to establish any interesting algebraic laws for merge. The problem is that equiv- 
alent terms are required to may-converge to the same set of values. If we relax 
this and instead require that equivalent terms may-converge to suitably related 
sets of values, we arrive at forms of (bi)simulation equivalence. These are the 
subject of the remainder of the paper. 

In this section we introduce notions of simulation and bisimulation. We state 
open problems regarding the relationship between (bi)simulation equivalences
and contextual equivalence, and we prove that contextual equivalence includes 
Kleene equivalence. 

First, we define a useful relational operator called compatible refinement [4].
Given a term relation R, its compatible refinement, $\widehat{R}$, relates two terms M and
N if they have the same outer syntactic constructor, the components of which
are pair-wise related by R. For example, the definition of compatible refinement
between values is: (1) $\ulcorner n \urcorner \mathrel{\widehat{R}} \ulcorner n \urcorner$, for all n; (2) $\lambda x.M \mathrel{\widehat{R}} \lambda x.N$, if $M \mathrel{R} N$; (3)
$K\,M_1 \cdots M_n \mathrel{\widehat{R}} K\,N_1 \cdots N_n$, if $M_i \mathrel{R} N_i$ for $1 \le i \le n$. A relation R is compatible
if $\widehat{R} \subseteq R$. Every compatible relation is closed under contexts. A congruence is a
compatible equivalence relation.

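The first use of compatible refinement is in the following definition.

Definition 3. A relation S between closed terms is a simulation when

$$M \mathrel{S} N \implies \forall \text{ closing } \sigma.\ \bigl(\forall V.\ N\sigma \Downarrow V \implies \exists U.\ M\sigma \Downarrow U \wedge U \mathrel{\widehat{S}} V\bigr) \ \wedge\ \bigl(M\sigma \downarrow \implies N\sigma \downarrow\bigr).$$

For example, Kleene equivalence, $\approx$, is a simulation.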

The greatest simulation (the union of all simulations) is a pre-order (reflexive
and transitive). Let mutual similarity be the induced equivalence relation. The
question of the congruence of mutual similarity (or of bisimilarity, the
greatest symmetric simulation) is an open one, and has been studied extensively by
Corin Pitcher and the authors. If this question is answered affirmatively, mutual
similarity will be included in contextual equivalence. In particular, it will entail
that $\approx$ is included in contextual equivalence, lemma 1, because $\approx$ is a symmetric
simulation and, hence, is included in mutual similarity.

As mentioned in section 3, amb is distinguished from erratic choice by its
must-convergence behaviour, imposing quite different obligations on
implementations. This may explain why the seemingly small difference in the natural
semantics rules defining must-convergence makes amb significantly more
difficult to reason about. For erratic choice, and even countable choice, a
collection of (bi)simulation equivalences that respect both may-convergence and
must-convergence behaviour has been proven congruent by variations of Howe’s
method [7, 17, 11]. These proofs fail for amb, for the following technical reason.
In the course of the congruence proofs one shows that a suitable “congruence
candidate relation” preserves must-convergence, i.e., for all related terms M and
N, $M \downarrow \implies N \downarrow$, which is established by rule induction on $M \downarrow$. For the last
three must-convergence rules in figure 2 we find that we need to know about
the set of outcomes of may-convergence of the left-most subterm of M. So the
induction hypothesis must be suitably strengthened to carry information about
may-convergence. This works well for the natural semantics of erratic choice and
countable choice, but not for amb. In the $(\mathit{Amb}_{\downarrow\alpha})$ rules the induction hypothesis
only tells us about one branch, $M_1$ say, of a choice expression $M_1 \mathbin{[]} M_2$, sufficient
for deducing the desired properties with regard to must-convergence of $M_1 \mathbin{[]} M_2$
which follows from one branch only, but insufficient for the set of outcomes of
may-convergence of $M_1 \mathbin{[]} M_2$ which depends on both branches.

One possible solution is first (1) to show an appropriate relationship
between the may-convergence behaviour of related terms by rule induction on
may-convergence judgements, and then (2) use this to prove the desired
relationship between must-convergence behaviours by rule induction on must-convergence
judgements. But it turns out that the inherent asymmetry in Howe’s
construction of the congruence candidate relation prevents us from linking up (1) and
(2). However, the simple nature of Kleene equivalence can be exploited to carry
out this programme in the proof of lemma 1, $\approx\ \subseteq\ \cong$, if we replace the congruence
candidate by a different, symmetric construction, denoted by $\approx^{sc}$. It is defined
inductively thus:

$$\frac{M \mathrel{\widehat{\approx^{sc}}} M' \qquad N \approx^{sc} N'}{M[N/x] \approx^{sc} M'[N'/x]} \qquad\qquad \frac{M \approx N}{M \approx^{sc} N}$$

where $\widehat{\approx^{sc}}$ has the obvious meaning. The relation $\approx^{sc}$ is the smallest compatible and
substitutive relation containing $\approx$. We can prove that $\approx^{sc}$ is a simulation by
rule induction on may-convergence and must-convergence judgements; the symmetry
of $\approx^{sc}$ is crucial in linking up the may- and must-parts of the proof. It follows
(by symmetry) that $\approx^{sc}$ only relates terms with identical must-convergence
behaviour. As $\approx^{sc}$ is also closed under contexts, it is included in $\cong$. By transitivity,
we conclude that $\approx\ \subseteq\ \cong$.

6 Cost Equivalence 

We will now present a cost-sensitive form of simulation, following Sands’ im- 
provement theory for deterministic languages [18] (related to the notion of “ex- 
pansion” in concurrency, see e.g. [2,20]). We are able to show that the induced 
simulation equivalence, known as cost equivalence, is a congruence. This leads to 
the introduction of a tick algebra as in [18], and finally a proof rule for recursive 
programs known as unique fixed point induction. 

To define a cost-sensitive simulation, we use the cost attributes of $\Downarrow$ and
$\downarrow$. This change will enable us to show the congruence of the induced mutual
similarity and to establish useful proof rules for it.
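
Definition 4. A relation S between closed terms is a cost simulation when

$$M \mathrel{S} N \implies \forall \text{ closing } \sigma.\ \bigl(\forall n, V.\ N\sigma \Downarrow^{n} V \implies \exists U.\ M\sigma \Downarrow^{\le n} U \wedge U \mathrel{\widehat{S}} V\bigr) \ \wedge\ \bigl(\forall \alpha.\ M\sigma \downarrow^{\alpha} \implies N\sigma \downarrow^{\le \alpha}\bigr)$$

where $\Downarrow^{\le n}$ and $\downarrow^{\le \alpha}$ have the obvious meanings.

We denote the resulting mutual cost similarity equivalence relation, or cost
equivalence, by $\doteq$.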

In the remainder of this section we develop the equational theory of $\doteq$ and we
establish bisimulation proof rules for reasoning about $\doteq$. First, we establish some
basic equational laws for use in equational reasoning in later examples. Similarly
to the equational theory of $\cong$ in section 4, most of the laws for $\doteq$ equate Kleene
equivalent terms but now only those with identical costs. Such terms are easily
seen to be cost equivalent (the “cost Kleene equivalence” relation between such
terms is a symmetric cost simulation and, hence, is included in $\doteq$).

With the exception of ($\beta$) and (case-$\beta$), the laws presented in section 4 are
valid for $\doteq$ in place of $\cong$. Where it does not lead to confusion, we will refer to
the $\doteq$ variants by the same name as their $\cong$ counterparts.

It is often the case that two terms are “almost” cost equivalent; that is, they
always differ in may- and must-cost by the same fixed amount. For example,
$(\lambda x.M)\,N$ and $M[N/x]$ always differ by exactly one unit of cost. If we had some
syntactic way of slowing the right-hand side down, we could write the relationship
between the two as a cost equivalence. To this end, we introduce the “tick”,
written $\checkmark$, which we will use to add a dummy step to a computation. Now we
can write:

$$(\lambda x.M)\,N \doteq \checkmark M[N/x] \qquad (\checkmark\text{-}\beta)$$

There is a similar law for case expressions. We can define $\checkmark$ within the language
by superfluous application: $\checkmark M \stackrel{\mathrm{def}}{=} I\,M$. Clearly, $\checkmark$ adds one unit to the cost of
evaluating M without otherwise changing its behaviour. Note that by ($\beta$):

$$\checkmark M \cong M \qquad (\text{erase-}\checkmark)$$

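There are a number of useful and easily verified laws for rearranging ticks,
including:

$$\begin{array}{rcll}
\checkmark(M \mathbin{[]} N) & \doteq & (\checkmark M) \mathbin{[]} (\checkmark N) & (\checkmark\text{-}[]\text{-dist}) \\
(\checkmark M)\,N & \doteq & \checkmark(M\,N) & (\checkmark\text{-float-apply}) \\
\mathsf{case}\ \checkmark M\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} & \doteq & \checkmark\,\mathsf{case}\ M\ \mathsf{of}\ \{K_i\,x_1 \cdots x_{m_i} \to N_i\}_{i=1}^{n} & (\checkmark\text{-float-case})
\end{array}$$

The equational laws are collectively known as the tick algebra.

The definitions of the cost measures and of cost equivalence are carefully
designed to enable us to prove that cost equivalence is a congruence.

Theorem 1. $\doteq$ is a congruence.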

This in turn entails that $\doteq$ is included in $\cong$. The congruence proof is inspired by
the proof of congruence of a cost equivalence relation for a deterministic
call-by-value language in [9]. Again, the proof follows the outline of Howe’s method but,
here, the congruence candidate relation is, basically, the smallest compatible and
transitive relation containing $\doteq$. This is a symmetric construction, a fact that
plays a crucial role in the proof, as it did for lemma 1.

The original congruence proof in [9] was designed to facilitate the derivation 
of Sangiorgi’s bisimulation up to context proof rule [19] and, eventually, Sands’
improvement theorem [18] for cost equivalence. We now present analogous results 
which can be derived from the congruence proof for our language. 

The following proof rule is very useful indeed; it allows us to prove equiv- 
alences between recursive programs. Sands has a similar rule in his call-by- 
name improvement theory [18], which he calls improvement induction, and [14] 
presents an improvement induction principle for deterministic call-by-need. We 
dub this proof rule unique fixed point induction. 

Theorem 2 (Unique Fixed Point Induction). For all term contexts C, the
following is a valid proof rule:

$$\frac{M \doteq \checkmark C[M] \qquad N \doteq \checkmark C[N]}{M \doteq N}$$

With the help of unique fixed point induction, it is no longer necessary to prove 
something about M’s and N’s behaviour in all contexts to prove them contex- 
tually equivalent. In most cases, it is enough to exhibit a single context and 
perform some simple equational reasoning. 

We have used unique fixed point induction in combination with the (erase-$\checkmark$)
law and the inclusion of cost equivalence in contextual equivalence to prove many
standard examples of contextual equivalence from deterministic languages, for
example the dinaturality law for fixed points and the many monad laws for
streams from e.g. [22,5].

7 Bottom-Avoiding Merge 

We now prove that the implementation of merge is bottom-avoiding:

$$\mathit{merge}\ xs\ \Omega \cong xs \mathbin{+\!\!+} \Omega.$$

Here, $+\!\!+$ is standard list concatenation, or append, and $\Omega$ is the always divergent
term. Let C be the (variable-capturing!) context $\mathsf{case}\ xs\ \mathsf{of}\ \{[] \to \Omega \mid z : xs \to z : [\cdot]\}$.
Unfolding $\mathit{merge}\ xs\ \Omega$ and simplifying with (case-strict) and ([]-$\Omega$) yields

$$\mathit{merge}\ xs\ \Omega \doteq \checkmark C[\mathit{merge}\ xs\ \Omega]$$

and unfolding $xs \mathbin{+\!\!+} \Omega$ gives us $xs \mathbin{+\!\!+} \Omega \doteq \checkmark C[xs \mathbin{+\!\!+} \Omega]$. The desired result
follows by unique fixed point induction and the inclusion $\doteq\ \subseteq\ \cong$.

Merge is also commutative, associative and has the empty list as left and
right unit. See [13] for proofs of these properties and other examples. Similar
lemmas may be found in [8], but there one needs to “take limits” to include
streams. Streams are included here without any extra effort on our part.

Abstract. We study two dynamical properties of linear D-dimensional
cellular automata over $\mathbf{Z}_m$, namely, denseness of periodic points and
topological mixing. Concerning denseness of periodic points, we complete
the work initiated in [9], [3], and [2] by proving that a linear cellular
automaton has dense periodic points over the entire space of configurations
if and only if it is surjective (as conjectured in [2]). For non-surjective
linear CA we give a complete characterization of the subspace where
periodic points are dense. Concerning topological mixing, we prove
that this property is equivalent to transitivity and hence easily checkable.
Finally, we classify linear cellular automata according to the definition
of chaos given by Devaney in [8].

Keywords: cellular automaton, discrete time dynamical system, chaos 
theory. 



1 Introduction 

Cellular Automata (CA) are dynamical systems consisting of a regular lattice
of variables which can take a finite number of discrete values. The global state
of the CA, specified by the values of all the variables at a given time, evolves
according to a global transition map F based on a local rule f which acts on
the value of each single variable in synchronous discrete time steps. A CA can
be viewed as a discrete time dynamical system (X, F) where $F : X \to X$ is the
CA global transition map defined over the configuration space X. CA have been
widely studied in a number of disciplines (e.g., computer science, physics,
mathematics, biology, chemistry) with different purposes (e.g., simulation of natural
phenomena, pseudo-random number generation, image processing, analysis of
universal models of computation, cryptography). For an introduction to CA
theory see [10]. Despite their simplicity, which makes a detailed algebraic
analysis possible, linear CA over $\mathbf{Z}_m$ (CA based on a linear local rule) exhibit many
of the complex features of general CA. For a complete and up-to-date reference
on applications of linear CA see [4].

Several important dynamical properties of linear CA, e.g., ergodicity,
transitivity, sensitivity to initial conditions, and expansivity, have been studied during
the last few years and in many cases exact characterizations have been obtained
(see for example [11,16,3,13,15,2]). In [14] the authors investigate and
completely characterize the structure of attractors for D-dimensional linear CA over
$\mathbf{Z}_m$, while in [7] the authors give a closed formula for computing their Lyapunov
exponents and their topological entropy.

In this paper we study two important dynamical properties of CA: denseness
of periodic points and topological mixing. We first investigate the structure of
the set of periodic points of linear CA. In particular we focus our attention on a
problem addressed in [2] where the authors prove that for 1-dimensional linear
CA surjectivity is equivalent to having dense periodic points over the entire space
of configurations, leaving open the problem of characterizing this last property
in the D-dimensional case. Then, we completely characterize topological mixing
for linear CA in terms of the coefficients of their local rule.

The main contributions of this paper can be summarized as follows.

- We prove (Theorem 2) that for linear D-dimensional CA over $\mathbf{Z}_m$ ($D \ge 1$, $m \ge 2$)
surjectivity is equivalent to having dense periodic points (implicitly
characterizing this last property).

- For non-surjective linear D-dimensional CA over $\mathbf{Z}_m$ we explicitly characterize
(Corollary 1) the largest subspace where periodic points are dense, taking
advantage of the results obtained in [14] on the attractors of linear CA over $\mathbf{Z}_m$.

- We prove (Theorem 5) that for linear D-dimensional CA over $\mathbf{Z}_m$ transitivity
is equivalent to topological mixing (implicitly characterizing this last property).

- We completely characterize (Corollary 2) the class of chaotic linear
D-dimensional CA over $\mathbf{Z}_m$ ($D \ge 1$, $m \ge 2$) according to one of the most popular
definitions of chaos, that given by Devaney in [8].

2 Basic definitions 

2.1 Cellular automata 

For $m \ge 2$, let $\mathbf{Z}_m = \{0, 1, \ldots, m-1\}$. We consider the space of configurations

$$\mathcal{C}_m^D = \{ c \mid c : \mathbf{Z}^D \to \mathbf{Z}_m \}$$

which consists of all functions from $\mathbf{Z}^D$ into $\mathbf{Z}_m$. Each element of $\mathcal{C}_m^D$ can be
visualized as an infinite D-dimensional lattice in which each cell contains an
element of $\mathbf{Z}_m$. Let $s \ge 1$. A neighborhood frame of size s is an ordered set of
distinct vectors $u_1, u_2, \ldots, u_s \in \mathbf{Z}^D$. Given $f : \mathbf{Z}_m^s \to \mathbf{Z}_m$, a D-dimensional CA
based on the local rule f is the pair $(\mathcal{C}_m^D, F)$, where $F : \mathcal{C}_m^D \to \mathcal{C}_m^D$ is the global
transition map defined as follows.

$$[F(c)](v) = f\bigl(c(v + u_1), \ldots, c(v + u_s)\bigr) \quad \text{where } c \in \mathcal{C}_m^D,\ v \in \mathbf{Z}^D. \qquad (1)$$

In other words, the content of cell v in the configuration F(c) is a function of
the content of cells $v + u_1, \ldots, v + u_s$ in the configuration c. Note that the local
rule f and the neighborhood frame completely determine F.

In order to study the topological properties of D-dimensional CA, we introduce
a distance over the space of the configurations. Let $\Delta : \mathbf{Z}_m \times \mathbf{Z}_m \to \{0, 1\}$ be defined
by

$$\Delta(i, j) = \begin{cases} 0, & \text{if } i = j, \\ 1, & \text{if } i \neq j. \end{cases}$$

Given $a, b \in \mathcal{C}_m^D$ the Tychonoff distance d(a, b) is given by

$$d(a, b) = \sum_{v \in \mathbf{Z}^D} \frac{\Delta(a(v), b(v))}{2^{\|v\|_\infty}} \qquad (2)$$

where, as usual, $\|v\|_\infty$ denotes the maximum of the absolute value of the
components of v. It is easy to verify that d is a metric on $\mathcal{C}_m^D$ and that the metric
topology induced by d coincides with the product topology induced by the discrete
topology of $\mathbf{Z}_m$. With this topology, $\mathcal{C}_m^D$ is a compact and totally disconnected
space and F is a (uniformly) continuous map.
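
The distance of Equation (2) can be evaluated exactly whenever two configurations differ on finitely many cells only. The following is a minimal Python sketch under that assumption: a configuration is stored as a dictionary from cells (tuples in $\mathbf{Z}^D$) to values, and every cell not listed is taken to hold 0. The function name tychonoff_distance and this representation are hypothetical choices made for the illustration.

```python
from fractions import Fraction


def tychonoff_distance(a, b):
    """d(a, b) for configurations given as dicts {cell: value}.

    Cells are tuples in Z^D; a cell that is not listed holds 0 (an
    assumption of this sketch), so a and b differ on finitely many
    cells and the sum in Equation (2) is finite.
    """
    total = Fraction(0)
    for v in set(a) | set(b):
        if a.get(v, 0) != b.get(v, 0):         # Delta(a(v), b(v)) = 1
            norm = max(abs(x) for x in v)      # ||v||_inf
            total += Fraction(1, 2 ** norm)
    return total


zero = {}                                      # the all-zero configuration
a = {(0, 0): 1}                                # differs from zero at the origin
b = {(1, 0): 1, (0, -2): 1}                    # differs from zero at norms 1 and 2

assert tychonoff_distance(zero, a) == 1
assert tychonoff_distance(zero, b) == Fraction(3, 4)
assert tychonoff_distance(a, b) == Fraction(7, 4)
```

Cells far from the origin contribute little to the sum, so two configurations are close exactly when they agree on a large box around the origin.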

Throughout the paper, F(c) will denote the result of the application of the
map F to the configuration c, and c(v) will denote the value taken by c in v. For
$n > 0$, we recursively define $F^n(c)$ by $F^n(c) = F(F^{n-1}(c))$, where $F^0(c) = c$.
Let $(\mathcal{C}_m^D, F)$ be a CA based on the local rule f. We denote by $f^{(n)}$ the local rule
associated to $F^n$.



2.2 Linear CA over Zm 



In the special case of linear CA the set $\mathbf{Z}_m$ is endowed with the usual sum
and product operations that make it a commutative ring. In what follows we
denote by $[x]_m$ the integer x taken modulo m. Linear CA have a local rule of the
form $f(x_1, \ldots, x_s) = \bigl[\sum_{i=1}^{s} \lambda_i x_i\bigr]_m$ with $\lambda_1, \ldots, \lambda_s \in \mathbf{Z}_m$. Hence, for a linear
D-dimensional CA Equation (1) becomes

$$[F(c)](v) = \Bigl[\sum_{i=1}^{s} \lambda_i\, c(v + u_i)\Bigr]_m \quad \text{where } c \in \mathcal{C}_m^D,\ v \in \mathbf{Z}^D. \qquad (3)$$
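
Equation (3) is easy to experiment with on spatially periodic configurations, which are determined by finitely many cells. The following is a minimal Python sketch for the 1-dimensional case, assuming the configuration is stored as a list holding one period; the function name linear_step and this finite representation are hypothetical and serve only as an illustration.

```python
def linear_step(c, coeffs, offsets, m):
    """One application of F from Equation (3) to a spatially periodic
    1-dimensional configuration, stored as one period c[0], ..., c[n-1].

    coeffs are the lambda_i, offsets are the neighborhood vectors u_i.
    """
    n = len(c)
    return [sum(lam * c[(v + u) % n] for lam, u in zip(coeffs, offsets)) % m
            for v in range(n)]


# Over Z_2 with f(x_1, x_2) = x_1 + x_2 and neighborhood frame (-1, 1):
# a single 1 spreads to its two neighbours.
assert linear_step([0, 0, 0, 1, 0, 0, 0, 0], [1, 1], [-1, 1], 2) == \
    [0, 0, 1, 0, 1, 0, 0, 0]

# Linearity over Z_6: F(a + b) = F(a) + F(b), all sums taken modulo 6.
m, coeffs, offsets = 6, [2, 3, 5], [-1, 0, 1]
a, b = [4, 1, 0, 5, 3, 2], [1, 1, 2, 0, 5, 3]
a_plus_b = [(x + y) % m for x, y in zip(a, b)]
Fa = linear_step(a, coeffs, offsets, m)
Fb = linear_step(b, coeffs, offsets, m)
assert linear_step(a_plus_b, coeffs, offsets, m) == [(x + y) % m for x, y in zip(Fa, Fb)]
```

The second check is the property that gives linear CA their name: the global map commutes with cellwise addition modulo m.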



2.3 Topological properties 

In this section we recall the definitions of some topological properties which
determine the qualitative behavior of any general discrete time dynamical system.
Here, we assume that the space of configurations X is equipped with a distance
d and that the map F is continuous on X according to the topology induced by
d.

Definition 1 (Transitivity). A dynamical system (X, F) is topologically
transitive if and only if for all non empty open subsets U and V of X there exists a
natural number n such that $F^{n}(U) \cap V \neq \emptyset$.

Intuitively, a transitive map F has points which eventually move under iteration
of F from one arbitrarily small neighborhood to any other. As a consequence,
the dynamical system cannot be decomposed into two disjoint open sets which
are invariant under the iterations of F.

Definition 2 (Topological mixing). A dynamical system (X, F) is topologically
mixing if and only if for all non empty open subsets U and V of X there
exists a natural number $n_0$ such that for every $n \ge n_0$ we have $F^{n}(U) \cap V \neq \emptyset$.

It is obvious that topological mixing implies transitivity.

Definition 3 (Strong transitivity). A dynamical system (X, F) is strongly
transitive if and only if for all nonempty open sets $U \subseteq X$ we have
$\bigcup_{n=0}^{+\infty} F^{n}(U) = X$.

A strongly transitive map F has points which eventually move under iteration 
of F from one arbitrarily small neighborhood to any other point. 

Definition 4 (Denseness of periodic points). Let

$$P(F) = \{ x \in X \mid \exists n \in \mathbf{N} : F^{n}(x) = x \}$$

be the set of the periodic points of F. A dynamical system (X, F) has dense
periodic orbits if and only if P(F) is a dense subset of X, i.e., for any $x \in X$
and $\varepsilon > 0$, there exists $y \in P(F)$ such that $d(x, y) < \varepsilon$.

Denseness of periodic orbits is often referred to as the element of regularity a
chaotic dynamical system must exhibit. The popular book by Devaney [8] isolates
three components as being the essential features of chaos: transitivity, sensitivity
to initial conditions and denseness of periodic orbits.

3 Properties of linear CA 

A D-dimensional cylinder $\langle (v_1, a_1), \ldots, (v_l, a_l) \rangle$ is a particular subset of $\mathcal{C}_m^D$
defined as

$$\langle (v_1, a_1), \ldots, (v_l, a_l) \rangle = \{ x \in \mathcal{C}_m^D \mid x(v_i) = a_i,\ i = 1, \ldots, l \},$$

where $a_i \in \mathbf{Z}_m$ and $v_i \in \mathbf{Z}^D$. Note that cylinders form a basis of closed and
open subsets of $\mathcal{C}_m^D$ according to the metric topology induced by the Tychonoff
distance.

We first recall a result proved in [13] which holds for strongly transitive
linear CA and states that for every cylinder $C \subseteq \mathcal{C}_m^D$ it is possible to find a
natural number $t_C$ such that, with a little abuse of notation, $F^{t_C}(C) = \mathcal{C}_m^D$, i.e.,
every configuration of $\mathcal{C}_m^D$ can be reached after exactly $t_C$ iterations of the map
F starting from one element of C.
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Lemma 1 ([13]). Let F be a strongly transitive linear D- dimensional CA over 
C^. Then for every cylinder C C there exists a natural number tc such that 

VxeC^ 3ceC: F‘<^(c)=x. 

The next lemma states that any linear D-dimensional CA over \(C_{pq}^D\) with p and q
relatively prime is topologically conjugated to the map \(G = ([F]_p, [F]_q)\) (where,
for every configuration c, we define \([F]_p(c) = [F(c)]_p\)). We say that two dynamical
systems (X, F) and (X', F') are topologically conjugated if there exists a bijective
function \(\varphi : X \to X'\) such that \(\varphi(F(x)) = F'(\varphi(x))\) and both \(\varphi\) and \(\varphi^{-1}\) are
continuous (that is, \(\varphi\) is a homeomorphism between X and X'). If (X, F) and
(X', F') are topologically conjugated then they share the same topological
properties, that is, (X, F) satisfies a given topological property if and only if (X', F')
satisfies it.

Lemma 2. Let F be a linear D-dimensional CA over \(C_m^D\) with m = pq and
gcd(p, q) = 1. Then F is topologically conjugated to the map
\(G : C_p^D \times C_q^D \to C_p^D \times C_q^D\) defined by

\[ G(x_1, x_2) = ([F]_p(x_1), [F]_q(x_2)) \quad \text{where } x_1 \in C_p^D,\ x_2 \in C_q^D. \]

Proof. Due to limited space, the proof of this lemma is omitted here and will
be given in the full paper.

Lemma 2 will be useful to prove both Theorem 2 and Theorem 5.
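
The commutation behind Lemma 2 can be checked numerically. The sketch below is only an illustration under assumptions that are not in the text: it works on a spatially periodic 1-dimensional lattice of finite length, and the helper names `step` and `split` are hypothetical. It verifies that reducing a configuration modulo p and modulo q commutes with one step of a linear rule modulo m = pq; it does not prove that the reduction map is a homeomorphism.

```python
from math import gcd

def step(config, coeffs, offsets, m):
    """One step of a linear CA over Z_m on a cyclic 1-D lattice (illustrative)."""
    n = len(config)
    return [sum(l * config[(v + u) % n] for l, u in zip(coeffs, offsets)) % m
            for v in range(n)]

def split(config, p, q):
    """Send a configuration over Z_pq to the pair (config mod p, config mod q)."""
    return [x % p for x in config], [x % q for x in config]

p, q = 4, 9
assert gcd(p, q) == 1
m = p * q
coeffs, offsets = [5, 7, 3], [-1, 0, 2]      # arbitrary example rule
c = [11, 0, 35, 8, 20, 17, 2]                # arbitrary example configuration

lhs = split(step(c, coeffs, offsets, m), p, q)       # reduce after one step of F
cp, cq = split(c, p, q)
rhs = (step(cp, coeffs, offsets, p),                 # one step of [F]_p and [F]_q
       step(cq, coeffs, offsets, q))
print(lhs == rhs)
```

Because gcd(p, q) = 1, the Chinese remainder theorem makes `split` a bijection cell by cell, which is the part of the conjugacy this finite check leaves out.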

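Lemma 3 ([7]). Let F be a linear D-dimensional CA over \(C_{p^k}^D\) with local rule

\[ f(x_1, \ldots, x_s) = \Big[ \sum_{i=1}^{s} \lambda_i x_i \Big]_{p^k} \]

and neighborhood vectors \(u_1, \ldots, u_s\). Define

\[ I = \{ i \mid \gcd(\lambda_i, p) = 1 \}, \qquad \hat{f} = \Big[ \sum_{i \in I} \lambda_i x_i \Big]_{p^k}, \]

and let \(\hat{F}\) be the global map associated to \(\hat{f}\). Then, there exists \(h \geq 1\) such that
for all \(c \in C_{p^k}^D\) we have \(F^{h}(c) = \hat{F}^{h}(c)\).

Let F be a surjective linear D-dimensional CA over \(C_m^D\). We call F a shift-like
CA of radius r if and only if there exist \(\lambda \in Z_m\) and \(u \in Z^D\) with \(\|u\|_\infty = r\)
such that

\[ [F(c)](v) = [\lambda\, c(v + u)]_m \quad \text{where } c \in C_m^D,\ v \in Z^D. \]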

Note that shift-like CA are surjective by definition and then from the
characterization of surjective linear CA given in [11] we conclude that \(\lambda\) and m are
relatively prime. Since \([\lambda^{\varphi(m)}]_m = 1\) (where \(\varphi\) is the Euler function), we
conclude that

\[ [F^{\varphi(m)}(c)](v) = [\lambda^{\varphi(m)} c(v + \varphi(m) u)]_m = c(v + \varphi(m) u) \]

and then \(F^{\varphi(m)}\) is a true shift CA if r > 0, the identity CA if r = 0.
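
As a small numerical illustration of this step, the following sketch computes the Euler function by brute force and checks that the coefficient of the iterated shift-like rule collapses to 1. The function names are hypothetical, and the 1-dimensional case (integer offset u) is assumed for simplicity.

```python
from math import gcd

def euler_phi(m):
    """Euler's totient: how many a in 1..m satisfy gcd(a, m) == 1."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def shift_power(lam, u, m):
    """(coefficient, offset) of the phi(m)-th iterate of the shift-like rule
    [F(c)](v) = lam * c(v + u) mod m, assuming gcd(lam, m) == 1."""
    k = euler_phi(m)
    return pow(lam, k, m), k * u

print(shift_power(5, 1, 12))   # radius 1: a pure shift by phi(12) cells
print(shift_power(7, 0, 10))   # radius 0: the identity map
```

Iterating n times multiplies the coefficient n times and adds the offset n times, so the n-th iterate has coefficient lam^n mod m and offset n*u; Euler's theorem then gives coefficient 1 at n = phi(m).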

In view of the above considerations, the dynamical behavior of shift-like 
CA can be easily analyzed. In particular, shift-like CA with radius zero are 
equicontinuous and then not topologically transitive, while shift-like CA with 
radius greater than zero are topologically mixing and then transitive but not 
strongly transitive. Finally, all shift-like CA have dense periodic points. 

Lemma 4. Let F be a surjective but not strongly transitive linear D-dimensional
CA over \(C_{p^k}^D\) with local rule \(f(x_1, \ldots, x_s) = [\sum_{i=1}^{s} \lambda_i x_i]_{p^k}\). Then there exists a
positive integer h such that the map \(F^{h}\) is a shift-like CA.

Proof. Since F is not strongly transitive then for every pair \(\lambda_i, \lambda_j\) of coefficients
we have that p divides at least one of them. Since F is surjective then there
exists at least one \(\lambda_i\) such that \(\gcd(\lambda_i, p) = 1\). As a consequence of the above
considerations, the set \(I = \{ i \mid \gcd(\lambda_i, p) = 1 \}\) contains exactly one element.
The thesis follows from Lemma 3.

4 Periodic points for surjective linear CA 

In this section we prove that for linear CA over \(Z_m\) surjectivity implies denseness
of periodic points. Since the converse implication was already proven to be true
in [2] (for general CA), we conclude that surjectivity is equivalent to denseness
of periodic points.

To this end we need the definition of permutive map. A map \(f : A^s \to A\) is
permutive in the variable \(x_i\) if for any \(a, a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_s\) belonging to A
there exists \(x \in A\) such that \(f(a_1, \ldots, a_{i-1}, x, a_{i+1}, \ldots, a_s) = a\). In other words,
f is permutive in the variable \(x_i\) if we can force f to output an arbitrary element
of A by acting on the variable \(x_i\), independently of the values taken by the other
variables. We say that f is leftmost (rightmost) permutive if it is permutive in
\(x_1\) (\(x_s\)).
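
For a finite alphabet the definition can be tested by brute force. The sketch below is an illustration only; `is_permutive` and the example rule over Z_6 are hypothetical choices. For a linear rule over Z_m one expects permutivity in a variable exactly when its coefficient is relatively prime to m, since multiplication by a coefficient is onto Z_m only for units.

```python
from itertools import product

def is_permutive(f, alphabet, s, i):
    """Brute-force check that f : A^s -> A is permutive in variable i (0-based):
    for every choice of the other variables, varying x_i reaches all of A."""
    for others in product(alphabet, repeat=s - 1):
        outputs = {f(*others[:i], x, *others[i:]) for x in alphabet}
        if outputs != set(alphabet):
            return False
    return True

m = 6
coeffs = [5, 2, 1]                      # example coefficients over Z_6

def f(*xs):
    """Linear local rule: sum of coeffs[i] * x_i, reduced modulo m."""
    return sum(l * x for l, x in zip(coeffs, xs)) % m

result = [is_permutive(f, range(m), 3, i) for i in range(3)]
print(result)
```

Here the first and last coefficients are units modulo 6 while the middle one is not, so the rule is leftmost and rightmost permutive but not permutive in the middle variable.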

Lemma 5. Let F be a topologically transitive 1-dimensional CA with local rule
f defined over \(A^Z\), where A is a possibly infinite alphabet. Let \(u_1, \ldots, u_s\) be the
neighbor vectors (integers in this case) associated with variables \(x_1, \ldots, x_s\) of f.
Assume that \(u_1 < 0\) and \(u_s > 0\). Moreover, assume that f is permutive in the
variables \(x_1\) and \(x_s\). Then F has dense periodic points over \(A^Z\).

Proof. (Sketch) Let \(w \in A^{2k+1}\) be any finite configuration of length 2k+1, where
k is an arbitrarily chosen positive integer. Since F is topologically transitive then
there exist \(n \in \mathbb{N}\) and \(V_0, W_0 \in A^Z\) such that

\[ V_0 = \cdots \alpha_2 \alpha_1 w_{-k} \cdots w_0 \cdots w_k \beta_1 \beta_2 \cdots \]
\[ W_0 = \cdots \alpha'_2 \alpha'_1 w_{-k} \cdots w_0 \cdots w_k \beta'_1 \beta'_2 \cdots \]

and \(F^{n}(V_0) = W_0\). Note that w is centered at the origin of the lattice, i.e.,
\(V_0(0) = W_0(0) = w_0\).

Since F is leftmost and rightmost permutive and \(u_1 < 0\) and \(u_s > 0\), it takes
a little effort to verify that it is possible to find a configuration \(V_1\) such that:

- \(V_1(j) = V_0(j)\) for \(|j| < k + 1\),
- \(F^{n}(V_1) = W_1\), and
- \(W_1(j) = V_1(j)\) for \(|j| \leq k + 1\).

By repeating the above procedure we are able to construct a sequence of
pairs of configurations \((V_i, W_i)\) such that \(F^{n}(V_i) = W_i\) and \(V_i(j) = W_i(j)\) for
\(j = -i-k, \ldots, k+i\) and \(i = 1, 2, \ldots\)

We have

\[ \lim_{i \to \infty} W_i = \lim_{i \to \infty} V_i = W \quad \text{and} \quad F^{n}(W) = W. \]

Since w can be arbitrarily chosen, we conclude that F has dense periodic orbits.

Theorem 1. Let F be a linear D-dimensional CA over \(C_{p^k}^D\). If F is strongly
transitive then it has dense periodic points.

Proof. (Sketch) Let \(\hat{F}\) be the global transition map defined in Lemma 3 (we
recall that \(\hat{F}\) is based on the local rule \(\hat{f}\) obtained from f by removing all the
coefficients that are not prime with p). From Lemma 3 we have that if \(\hat{F}\) has
dense periodic points then F has dense periodic points. As a consequence of this
fact, we may now assume that all the coefficients of F (at least two of them are
non-zero since F is strongly transitive) are prime with p. We prove the thesis
by induction on D. We know that in the 1-dimensional case the thesis is true
(see [2]), we assume the thesis in dimension D - 1, and we prove it in dimension
D.

We proceed as follows: we note that every D-dimensional linear CA F over
\(C_{p^k}^D\) can be seen as a 1-dimensional linear CA consisting of the sum of a finite
number of (D-1)-dimensional CA, each of them defined over the infinite alphabet
\(C_{p^k}^{D-1}\). In other words, we take the D-dimensional space and we split it (along
a suitable (D-1)-dimensional hyperplane) into (D-1)-dimensional “slices”. The
new global transition map we are going to consider will take a certain number
of slices as input and will produce a new slice as output.

Slicing takes place according to the neighbor structure of F as follows. If
all the neighbor vectors lie on the same hyperplane H (in D - 1 dimensions),
then we slice the space according to H. In this case the 1-dimensional CA \(F^*\)
we obtain is based on a local rule that depends on 1 variable with null neighbor
vector. In other words, each slice evolves independently of any other slice and
its evolution is governed by a linear strongly transitive (D-1)-dimensional CA
over \(C_{p^k}^{D-1}\) which has dense periodic points by induction. We conclude that \(F^*\) has
dense periodic points.

If the neighbor vectors cannot be covered by a unique hyperplane, then we
can always choose a (D-1)-dimensional hyperplane H such that: if \(u_i\) belongs
to H then there exist \(u_j\) and \(u_{j'}\) which lie on opposite sides of H. Slicing
according to H we obtain a 1-dimensional CA \(F^*\) which satisfies the hypothesis
of Lemma 5 and then we conclude that \(F^*\) has dense periodic points.

It remains to be proven that surjective but not strongly transitive linear CA 
have dense periodic points. 

Theorem 2. Surjective linear D-dimensional CA over \(C_m^D\) have dense periodic
points.

Proof. Let \(m = p_1^{k_1} \cdots p_n^{k_n}\) be the prime factor decomposition of m. Let
\(m_i = p_i^{k_i}\), \(1 \leq i \leq n\). From (a repeated application of) Lemma 2 we have that F
has dense periodic points if and only if \([F]_{m_i}\) has dense periodic points for every
\(1 \leq i \leq n\). If \([F]_{m_i}\) is strongly transitive then in view of Theorem 1 it has dense
periodic points. Assume now that \([F]_{m_i}\) is not strongly transitive. Since F is
surjective, \([F]_{m_i}\) must be surjective and in view of Lemma 4 we conclude that
there exists a positive integer h such that \([F]_{m_i}^{h}\) is a shift-like CA. Since shift-like
CA have dense periodic points we obtain the thesis.



5 Periodic points for non-surjective linear CA



In this section we study the distribution of periodic points of non-surjective linear
CA with the aim of understanding which points of \(C_m^D\) can be approximated with
arbitrary precision by periodic points. To this end, we take advantage of the
theory of attractors applied to linear CA developed in [14]. In [14] the authors
prove that for any non-surjective linear CA F, there exists a subspace \(Y_F\) such
that, for any configuration x, \(F^{k}(x) \in Y_F\) for all \(k \geq \lfloor \log_2 m \rfloor\). That is, after
a transient phase of length at most \(\lfloor \log_2 m \rfloor\), the evolution of the system takes
place completely within the subspace \(Y_F\). This result indicates that, in order
to study periodic points of non-surjective linear CA, one should analyze the
behavior of the map F over the subspace \(Y_F\). In addition, they prove that the
behavior of F over \(Y_F\) is identical to the behavior of a linear surjective map \(F^*\)
defined over a configuration space isomorphic to \(Y_F\).

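Let F denote the global transition map of a non-surjective linear D-dimensional
CA over \(C_m^D\) defined by

\[ [F(c)](v) = \Big[ \sum_{i=1}^{s} \lambda_i c(v + u_i) \Big]_m. \tag{4} \]

Let \(d = \gcd(m, \lambda_1, \ldots, \lambda_s)\). Since F is not surjective we know that d > 1.
Let \(m = p_1^{k_1} p_2^{k_2} \cdots p_n^{k_n}\). Without loss of generality we can assume that
\(d = p_1^{h_1} p_2^{h_2} \cdots p_l^{h_l}\) with \(1 \leq h_i \leq k_i\) and \(l \leq n\). Let

\[ q = p_1^{k_1} \cdots p_l^{k_l} \tag{5} \]

and define

\[ Y_F = \{ c \in C_m^D \mid [c(v)]_q = 0,\ \forall v \in Z^D \} \quad \text{and} \quad m^* = \frac{m}{q}. \tag{6} \]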

The following theorem is a combination of Theorems 3.1 and 3.2 of [14].

Theorem 3 ([14]). Let \((C_m^D, F)\) denote a non-surjective linear CA. Let \(Y_F\) and
\(m^*\) be defined as in (6). Then

(a) for any \(c \in C_m^D\) and \(k \geq \lfloor \log_2 m \rfloor\), we have \(F^{k}(c) \in Y_F\) and

(b) the subsystem \((Y_F, F)\) is topologically conjugated to the surjective linear CA
\((C_{m^*}^D, [F]_{m^*})\).

Taking advantage of Theorem 3 we can prove the main result of this section.

Corollary 1. Let \((C_m^D, F)\) denote a non-surjective linear CA. Let \(Y_F\) be defined
as in (6). Then

(c) the periodic points of F are dense over \(Y_F\) and

(d) \(Y_F\) is the largest subset of \(C_m^D\) where F has dense periodic points.

Proof. From Theorem 3 we know that after at most \(\lfloor \log_2 m \rfloor\) steps the evolution
of \((C_m^D, F)\) takes place completely within the subspace \(Y_F\). This implies that all
periodic points belong to \(Y_F\). In addition, the subsystem \((Y_F, F)\) is topologically
conjugated to a surjective linear CA \((C_{m^*}^D, [F]_{m^*})\) which, in view of Theorem 2,
has dense periodic points over the entire \(C_{m^*}^D\). Since topological conjugation
preserves denseness of periodic orbits, we conclude that F has dense periodic
points over \(Y_F\).

Let \(x \in C_m^D\) be any configuration which does not belong to \(Y_F\). Then there
exists a vector \(v \in Z^D\) such that for every \(y \in Y_F\) we have \(x(v) \neq y(v)\) and then
\(d(x, y)\) is bounded below by a positive constant that depends only on v. We
conclude that \(Y_F\) is the largest subset of \(C_m^D\) where F has
dense periodic points.
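
The role of q and of the transient bound can be illustrated with a short simulation. This is a sketch under assumptions not in the text: a cyclic 1-dimensional lattice, hypothetical helper names, and the reading of (5) and (6) in which q collects the full prime powers of m whose prime divides d = gcd(m, λ_1, ..., λ_s) and m* = m/q. It checks that after ⌊log_2 m⌋ steps every cell is divisible by q.

```python
from math import gcd

def factorize(n):
    """Prime factorization as a dict {prime: exponent}, by trial division."""
    out, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            out[p] = out.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        out[n] = out.get(n, 0) + 1
    return out

def attractor_modulus(coeffs, m):
    """Return (q, m_star): q is the product of the prime powers p^k of m
    for which p divides d = gcd(m, coeffs); m_star = m // q."""
    d = m
    for l in coeffs:
        d = gcd(d, l)
    q = 1
    for p, k in factorize(m).items():
        if d % p == 0:
            q *= p ** k
    return q, m // q

def step(config, coeffs, offsets, m):
    """One step of a linear CA over Z_m on a cyclic 1-D lattice."""
    n = len(config)
    return [sum(l * config[(v + u) % n] for l, u in zip(coeffs, offsets)) % m
            for v in range(n)]

m, coeffs, offsets = 12, [2, 4], [-1, 1]     # d = 2, so the rule is not surjective
q, m_star = attractor_modulus(coeffs, m)
transient = m.bit_length() - 1               # floor(log2 m) for m >= 1
c = [1, 5, 7, 11, 3]                         # arbitrary starting configuration
for _ in range(transient):
    c = step(c, coeffs, offsets, m)
print(q, m_star, c)
```

The bound works because each step multiplies every cell by a multiple of d, and an exponent k_i with p_i^{k_i} ≤ m can never exceed ⌊log_2 m⌋.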

6 Topological Mixing for linear CA

In this section we prove that topological mixing and transitivity are equivalent
properties as far as linear CA are concerned.

Theorem 4. Let F be a linear D-dimensional CA over \(C_m^D\). If F is strongly
transitive then it is topologically mixing.

Proof. Let \(C \subseteq C_m^D\) be any cylinder and \(t_C\) be the positive integer defined in
Lemma 1. We have that \(F^{t_C}(C) = C_m^D\). Since F is surjective, we have

\[ \forall n \geq t_C : \ F^{n}(C) = C_m^D, \]

that is, F is topologically mixing as claimed.

Theorem 5. Transitive linear D-dimensional CA over \(C_m^D\) are topologically
mixing.

Proof. Let \(m = p_1^{k_1} \cdots p_n^{k_n}\) be the prime factor decomposition of m. Let
\(m_i = p_i^{k_i}\), \(1 \leq i \leq n\). From (a repeated application of) Lemma 2 we have that F
is topologically mixing if and only if \([F]_{m_i}\) is topologically mixing for every
\(1 \leq i \leq n\). Assume now that \([F]_{m_i}\) is not topologically mixing; by Theorem 4 it
is then not strongly transitive. Since F is transitive, \([F]_{m_i}\) must be transitive
(and then surjective). In view of Lemma 4
there exists a positive integer h such that \([F]_{m_i}^{h}\) is a shift-like CA with radius r.
Since \([F]_{m_i}\) is transitive r must be greater than zero. The thesis follows from the
fact that shift-like CA with radius greater than zero are topologically mixing.

Since topologically mixing CA are transitive by definition, we conclude that for
linear CA transitivity and topological mixing are equivalent properties.



7 Chaotic behavior of linear cellular automata 



In this section we classify linear D-dimensional CA over \(Z_m\) (\(D \geq 1\), \(m \geq 2\))
according to Devaney’s definition of chaos.



Definition 5 (Devaney’s Chaos). A dynamical system is chaotic according
to Devaney’s definition of chaos if and only if it is topologically transitive,
it is sensitive to initial conditions, and it has dense periodic points.



Let \((C_m^D, F)\) be a D-dimensional linear CA over \(Z_m\) defined by

\[ [F(c)](v) = \Big[ \sum_{i=1}^{s} \lambda_i c(v + u_i) \Big]_m \quad \text{where } c \in C_m^D,\ v \in Z^D, \tag{7} \]

where, as usual, we assume \(\|u_1\|_\infty = 0\) and \(\|u_i\|_\infty > 0\) for every \(2 \leq i \leq s\). Let
\(\mathcal{P}\) be the set of prime factors of m. We have the following results:

(a) \((C_m^D, F)\) is topologically transitive if and only if \(\gcd(\lambda_2, \ldots, \lambda_s, m) = 1\) (see
[3]).

(b) \((C_m^D, F)\) is sensitive to initial conditions if and only if there exists \(p \in \mathcal{P}\)
which does not divide \(\gcd(\lambda_2, \ldots, \lambda_s)\) (see [13]).

(c) \((C_m^D, F)\) is surjective if and only if \(\gcd(\lambda_1, \ldots, \lambda_s, m) = 1\) (see [11]).

(d) \((C_m^D, F)\) has dense periodic points if and only if it is surjective (this paper).



As a consequence of (a)-(d) we have the following corollary.

Corollary 2. Let \((C_m^D, F)\) be any D-dimensional linear CA over \(Z_m\) based on a
local rule f with coefficients \(\lambda_1, \ldots, \lambda_s\). The following statements are equivalent.

- \((C_m^D, F)\) is chaotic according to Devaney’s definition of chaos,
- \((C_m^D, F)\) is topologically transitive,
- \(\gcd(\lambda_2, \ldots, \lambda_s, m) = 1\).
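
The gcd conditions (a)-(d) translate directly into a small decision procedure. The sketch below is an illustration only; the function names are hypothetical, and the coefficient list is assumed to be ordered so that its first entry belongs to the null neighbor vector, as in (7).

```python
from functools import reduce
from math import gcd

def prime_factors(m):
    """Set of primes dividing m (trial division)."""
    primes, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes

def classify(coeffs, m):
    """coeffs[0] is the coefficient of the null neighbor vector;
    coeffs[1:] belong to the non-null neighbor vectors."""
    g_rest = reduce(gcd, coeffs[1:], 0)          # gcd(lambda_2, ..., lambda_s)
    transitive = gcd(g_rest, m) == 1             # condition (a)
    sensitive = any(g_rest % p != 0 for p in prime_factors(m))   # condition (b)
    surjective = gcd(reduce(gcd, coeffs, 0), m) == 1             # condition (c)
    return {
        "transitive": transitive,
        "sensitive": sensitive,
        "surjective": surjective,
        "dense_periodic_points": surjective,     # condition (d)
        "chaotic": transitive and sensitive and surjective,
    }

print(classify([1, 1, 1], 2))    # sum of three neighbors modulo 2
print(classify([1, 2, 4], 6))    # sensitive and surjective, but not transitive
```

The second example shows why transitivity is the decisive condition: sensitivity and dense periodic points can both hold while the system fails to be chaotic.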



Acknowledgments. I wish to thank G. Cattaneo and G. Manzini for many 
useful discussions.

Abstract. This paper discusses real-time language recognition by 1-dimensional
one-way cellular automata (OCA) and two-way cellular automata (CA), focusing
on limitations of the parallel recognition power. We summarize previous research
and investigate several languages to clarify the problems concerning the
real-time language recognition power of CA and OCA. It is shown that:

1. The language \(\{ww : w \in \{0,1\}^+\}\) cannot be recognized by OCA
in real time (this proposition is derived from a pumping lemma for
cyclic strings);

2. There are languages \(L \subseteq \Sigma^*\) such that \(L\Sigma^*\) and its reversal can
be recognized by CA in real time but L is not recognizable by OCA
in real time; and

3. The language \(\{w\$w^n : w \in \{0,1\}^+, n \geq 1\}\), as well as its reversal, is
recognizable by CA in real time.

The last result refutes a conjecture of Ibarra and Jiang [8].

Key words. Cellular automata, one-way cellular automata, parallel language
recognition, closure under reversal.



1 Introduction 

As serial language recognition has been a fundamental subject in traditional
automata and computation theories, parallel language recognition is an essential
theme in parallel computation theory. Among the models for parallel language
recognition, cellular automata (CA), also called cellular arrays [7, 8], have the
advantages of simplicity and uniformity. Although parallel language recognition by
CA has been investigated for more than three decades, many important problems
still remain unsolved.

This paper discusses real-time language recognition by 1-dimensional 2- and
3-neighbor deterministic bounded CA, focusing on limitations of the parallel
recognition power. Intuitively, we say that a CA recognizes a language \(L \subseteq \Sigma^*\)
in time t, if and only if there is a transition of configurations for each string
\(w \in \Sigma^*\) such that:

1. The initial configuration (i.e., the sequence of cell states) equals #w#,
where # is the imaginary boundary state; and

2. The rightmost cell has an accepting state in the t-th configuration if and
only if \(w \in L\).

A formal definition is given in Section 2. The CA recognizes the language in
real time, or in linear time, if and only if the recognition time t is equal to
\(|w| - 1\) or \(c \cdot (|w| - 1)\), respectively, for a constant \(c \geq 1\). Real-time language
recognition is especially important, since languages are likely recognized in real
time in the human mind.

A one-way cellular automaton (OCA) is a special CA such that the next
state of each cell does not depend on the state of the right neighbor cell. The
real-time language recognition models are illustrated in Fig. 1 and Fig. 2. We
also discuss the reflective models of the CA and OCA, namely the “left CA” and
“left OCA,” respectively, in which the acceptance is decided by the leftmost cells.
The language recognized by a CA is called the cellular automaton language
(CAL), and that by an OCA the one-way cellular automaton language (OCAL).
Note that in some of the literature [7, 8] the reversal of CAL, i.e. left CAL, is the
standard and is simply called CAL.




Fig. 1. Real-time recognition by CA.

Fig. 2. Real-time recognition by OCA.



Several language recognition models of cellular automata were first
investigated by Kasami et al. [9] and Cole [4] in the late 1960s. Smith III [11]
first discussed real-time language recognition by CA in 1972. In 1980, Dyer [5]
first discussed the real-time language recognition power of one-way CA (OCA). The
results of previous research closely related to this paper are summarized as
follows, where C(L) denotes the class of languages L.

1. C(CAL) is equal to C(context-sensitive language) [9, 11]. C(OCAL) includes
C(context-free language) [9].

2. C(real-time OCAL), as well as C(real-time CAL), includes non-context-free
languages such as \(\{a^n b^n c^n : n \geq 1\}\) [5, 11]. C(real-time CAL) includes
complex languages such as \(\{1^n : n \text{ is a prime number}\}\) [6, 11].

3. Both C(real-time OCAL) and C(real-time CAL) are closed under set
operations [11].

4. C(real-time OCAL) is closed under reversal, hence C(real-time OCAL) =
C(real-time left OCAL) [3]. C(linear-time CAL) is also closed under reversal
[11].

5. C(real-time CAL) = C(linear-time left OCAL) and C(real-time left CAL) =
C(linear-time OCAL) [13, 3].

Especially important problems are related to the limitation of the real-time
language recognition power of CA and OCA. In 1984, Choffrut and Culik II [3]
proved that C(real-time OCAL) is a proper subclass of C(real-time CAL) by
showing that the language \(\{1^{2^n} : n \geq 0\}\) can be recognized by CA but not by
OCA in real time. More recently, Terrier [15, 14] showed two languages which
are recognizable by CA but not by OCA. One of the languages is
\(\{uvu : u, v \in \{0,1\}^*, |u| > 1\}\). The other is the language \(L_1L_1\), where

\[ L_1 = \{ w : w = 1^j 0^j \text{ or } w = 1^j 0 y 1 0^j,\ y \in \{0,1\}^*,\ j > 0 \}. \]

It is important that \(L_1L_1\) is a context-free language. The existence of this
language implies that C(real-time OCAL) is not closed under concatenation, since
\(L_1\) is a real-time OCAL.
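
To make the shape of these strings concrete, here is a direct (sequential, not cellular) membership test. It is a sketch that assumes the definition reads L_1 = {w : w = 1^j 0^j or w = 1^j 0 y 1 0^j, y in {0,1}*, j > 0}; the function names are hypothetical.

```python
def in_L1(w):
    """Membership in L_1 under the assumed reading of its definition."""
    if set(w) - {"0", "1"}:
        return False
    n = len(w)
    for j in range(1, n // 2 + 1):
        if not (w.startswith("1" * j) and w.endswith("0" * j)):
            continue
        middle = w[j:n - j]
        if middle == "":
            return True                       # w = 1^j 0^j
        if len(middle) >= 2 and middle[0] == "0" and middle[-1] == "1":
            return True                       # w = 1^j 0 y 1 0^j
    return False

def in_L1L1(w):
    """Membership in the concatenation L_1 L_1: try every split point."""
    return any(in_L1(w[:i]) and in_L1(w[i:]) for i in range(1, len(w)))

print(in_L1("1100"), in_L1("100110"), in_L1("110"))
print(in_L1L1("1010"), in_L1L1("10"))
```

The split point in `in_L1L1` is exactly what a one-way automaton cannot guess in real time, which is the intuition behind the non-closure under concatenation.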

On the parallel recognition power of CA, the following fundamental problems
have remained open since Smith III [11] first posed them in 1972.

1. Is C(real-time CAL) closed under reversal, or equivalently, is C(real-time
CAL) equivalent to C(real-time left CAL)?

2. Is C(real-time CAL) closed under concatenation?

3. Is the real-time recognition power of CA equivalent to the linear-time
recognition power?

4. Is the real-time recognition of CA equivalent to the linear-time recognition?

Many researchers have conjectured that the answers to these questions are
negative [3, 8]. Ibarra and Jiang proved the following propositions, which are
closely related to solving the problems [8].

- C(real-time CAL) is closed under reversal if and only if C(linear-time CAL) =
C(real-time CAL).

- If C(real-time CAL) is closed under reversal, then the class is also closed
under concatenation.

- If CA is more powerful than OCA, nonlinear-time CA is more powerful than
linear-time CA.

In this paper, we investigate several languages to clarify the problems concerning
the real-time language recognition power of CA and OCA. It is shown that:

1. The language \(\{ww : w \in \{0,1\}^+\}\) cannot be recognized by OCA in real time
(this proposition is derived from a pumping lemma for cyclic strings);

2. There are languages \(L \subseteq \Sigma^*\) such that \(L\Sigma^*\) and its reversal can be
recognized by CA in real time but L is not recognizable by OCA in real time;
and

3. The language \(\{w\$w^n : w \in \{0,1\}^+, n \geq 1\}\), as well as its reversal, is
recognizable by CA in real time.

The last result refutes the conjecture of Ibarra and Jiang [8].

2 CA, OCA and their languages

A cellular automaton (CA) is a system S = (K, #, f, A), where:

- K is a finite set of cell states;

- \(\# \in K\) is the boundary state;

- f, a transition function, is a mapping from \(K \times K \times K\) into \(K - \{\#\}\); and

- \(A \subseteq K\) is a set of accepting states.

A one-way cellular automaton (OCA) is a CA S = (K, #, f, A) such that there
exists a function \(f' : K \times K \to K\) with \(f(x, y, z) = f'(x, y)\) for \(x, y, z \in K\).

A configuration is a string #u# of states with \(u \in (K - \{\#\})^+\). The transition
function f is extended to that of configurations by

\[ f(\# a_1 a_2 \cdots a_n \#) = \# b_1 b_2 \cdots b_n \# \]

where \(b_i = f(a_{i-1}, a_i, a_{i+1})\) for all i, \(1 \leq i \leq n\), and \(a_0 = a_{n+1} = \#\). The function
\(f^n\) of configurations is recursively defined by \(f^0(c) = c\) and \(f^n(c) = f(f^{n-1}(c))\),
for all configurations c and n > 0.

A CA S = (K, #, f, A) recognizes a language \(L \subseteq \Sigma^*\) in time t (by the
rightmost cell), if and only if \(\Sigma \subseteq K\) and for all \(w \in \Sigma^*\), \(|w| \geq 2\),

\[ w \in L \iff f^{t}(\# w \#) \in \# K^* A \#. \]

The CA S recognizes L in real time, if and only if \(t = |w| - 1\). The CA S recognizes
L in linear time, if and only if there is a constant \(c \geq 1\) with \(t \leq c \cdot (|w| - 1)\).
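
These definitions can be turned into a few lines of simulation. The sketch below is an illustration only: the names `step`, `recognizes_in_real_time`, and the toy rule `all_ones_rule` are hypothetical and not taken from the paper. The toy rule ignores its right neighbor, so it is a one-way rule, and it accepts exactly the strings consisting only of the symbol 1.

```python
BOUNDARY = "#"

def step(f, cells):
    """Apply the local rule f(left, centre, right) to every cell; the boundary
    state is supplied on both sides and is never rewritten."""
    padded = [BOUNDARY] + list(cells) + [BOUNDARY]
    return [f(padded[i - 1], padded[i], padded[i + 1])
            for i in range(1, len(padded) - 1)]

def recognizes_in_real_time(f, accepting, w):
    """True iff the rightmost cell is in an accepting state after |w| - 1 steps."""
    cells = list(w)
    for _ in range(len(w) - 1):
        cells = step(f, cells)
    return cells[-1] in accepting

def all_ones_rule(left, centre, right):
    """Toy one-way rule: a cell stays '1' only while it and everything it has
    heard about from its left are '1'. The argument `right` is ignored."""
    return "1" if centre == "1" and left in ("1", BOUNDARY) else "0"

for w in ["1111", "1011", "0111"]:
    print(w, recognizes_in_real_time(all_ones_rule, {"1"}, w))
```

The example also shows why |w| - 1 steps is the natural real-time bound: information from the leftmost cell needs exactly that many steps to reach the rightmost one.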

In the real-time recognition, we can write

\[ f^{i}(a_1 a_2 \cdots a_n \#) = b_{i+1} \cdots b_n \# \]

for all \(1 \leq i \leq n - 1\). For real-time recognition by an OCA, we also write
\(f^{i}(a_1 a_2 \cdots a_n) = b_{i+1} \cdots b_n\). Note that the boundary state # is in fact not
necessary for this case.

The language recognized by the leftmost cell, called the left CAL or left
OCAL, is defined similarly, and it is equivalent to the reversal of the language
recognized by the rightmost cell. Since this paper mainly deals with the
acceptance by the rightmost cell, we simply write CAL and OCAL instead of
“right CAL” and “right OCAL.”

The following lemmas are straightforward from the definitions or well known
in this field.

1. The real-time recognition in time \(|w| - 1\) for any string w is equivalent to
that in time \(|w|\). This difference is important since the \(|w| - 1\) recognition
does not require the left boundary symbol.

2. For the linear-time recognition, n-times speed-up of recognition time can be
applied to the CA, where n is any positive integer [11, 12].

3. For any real-time CAL L over \(\Sigma\), \(\Sigma^* L\) is a real-time CAL.

4. For any real-time OCAL L over \(\Sigma\), \(L\Sigma^*\) is also a real-time OCAL.

3 Recognition of cyclic strings by OCA 

A cyclic string of any alphabet \(\Sigma\) is either a string of the form \(u^k\) with \(u \in \Sigma^+\)
and \(k \geq 2\) or a consecutive substring w of \(u^k\) with \(|u^2| \leq |w| \leq |u^k|\), \(k \geq 3\). The
simplest cyclic strings are \(1^k\) with \(k \geq 2\).

Proposition 1 (Pumping lemma for cyclic strings). For any real-time
OCAL L, there exists an integer n such that for any string u and any integer k,
if \(u^k \in L\) and \(k > n^{|u|}\) then there is an integer \(m \geq 1\) such that \(u^{k+jm} \in L\) for
all \(j \geq 1\).

Proof. Let S = (K, #, f, A) be an OCA which recognizes L. Let n be |K|. Then,
for a string u and an integer k, there are strings \(w_1, w_2, \ldots\) with \(|w_i| = |u|\) such

that



and \(f^{|u^k| - 1}(u^k) \in A\). Since the number of possible strings \(w_j\) is bounded by
\(n^{|u|}\), the sequence \(w_1, w_2, \ldots\) falls into a loop if \(k > n^{|u|}\). There exist l and m with
\(l, m \leq n^{|u|}\) such that \(w_l = w_{l+jm}\) for all \(j \geq 1\). This implies that \(u^{k+jm} \in L\) for
all \(j \geq 1\). □
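
The pigeonhole effect behind the lemma is easy to observe on unary inputs. The sketch below is an illustration only, with a hypothetical one-way rule `toy_rule` on five states; it simulates real-time acceptance of the strings 1^k and lists the accepted lengths, which turn out to be periodic.

```python
def oca_accepts(f2, accepting, w):
    """Real-time acceptance by a one-way CA with local rule f2(left, self):
    run |w| - 1 steps and inspect the rightmost cell."""
    cells = list(w)
    for _ in range(len(w) - 1):
        cells = [f2(cells[i - 1] if i > 0 else "#", cells[i])
                 for i in range(len(cells))]
    return cells[-1] in accepting

def toy_rule(left, centre):
    """Hypothetical rule on K = {0, 1, 2, 3, 4}; the boundary '#' is read as 0."""
    left = 0 if left == "#" else left
    return (left * centre + 1) % 5

# Lengths k in 2..20 for which the unary string 1^k is accepted.
accepted = [k for k in range(2, 21) if oca_accepts(toy_rule, {0}, [1] * k)]
print(accepted)
```

On a unary input, every cell not yet reached by the left boundary holds the same state, so the rightmost state after k - 1 steps is obtained by iterating a single map on the finite set K. Such an iteration must become periodic, which is the |u| = 1 case of the argument.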

This lemma can be used to show that several languages are not real-time
OCAL’s. The following proposition is a direct consequence of Proposition 1.
Note that this proposition implies that the language \(\{1^{2^n} : n \geq 0\}\) is not a
real-time OCAL, as proved in [3].

Proposition 2. For any integer function g and any alphabet \(\Sigma\), if \(g(n) > O(n)\)
then the language \(\{w^{g(n)} : w \in \Sigma^+, n \geq 1\}\) is not a real-time OCAL.

It is not difficult to construct a CA recognizing the “double word” language
\(\{ww : w \in \Sigma^+, |\Sigma| \geq 2\}\) in real time [footnote 1]. On the other hand, it has not been
shown whether this fundamental language is a real-time OCAL. Terrier [14]
showed a necessary condition for real-time OCAL and proved that the language
\(\{uvu : u, v \in \{0,1\}^*, |u| > 1\}\) is not a real-time OCAL. This does not, however,
imply that the double word language is not a real-time OCAL.

[footnote 1] A CA recognizing the double word language can be obtained by slightly modifying
the CA for the language \(\{uvu : u, v \in \{0,1\}^*, |u| > 1\}\) by Terrier [14].

Proposition 3. The double word language \(\{ww : w \in \Sigma^+, |\Sigma| \geq 2\}\) is not a
real-time OCAL.

In the proof of this proposition, we use the notion of k (\(k \geq 2\)) consecutive
double words, which are defined as consecutive substrings, of length
\(|u^2| + k - 1\), of cyclic strings built from a word u. For example, strings 111 and 12121 are 2 consecutive
double words, and 001001011 does not have two or more consecutive double
words. Note that there are strings with arbitrary length which do not contain
two or more consecutive double words, if \(|\Sigma| \geq 2\).
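
The notion can be made concrete with a direct check. The sketch below is an illustration under one reading of the definition: a string of length 2p + k - 1 is taken to be "k consecutive double words" when each of its k windows of length 2p is a double word, which is the same as saying the whole string is a substring of a power of a word of length p. The function names are hypothetical.

```python
def is_double_word(s):
    """True iff s = ww for some nonempty w."""
    h = len(s) // 2
    return len(s) % 2 == 0 and h > 0 and s[:h] == s[h:]

def is_k_consecutive_double_words(s, k):
    """True iff s has length 2p + k - 1 for some p >= 1 and each of its k
    windows of length 2p is a double word."""
    width = len(s) - (k - 1)            # candidate value of 2p
    if width < 2 or width % 2:
        return False
    return all(is_double_word(s[i:i + width]) for i in range(k))

print(is_k_consecutive_double_words("111", 2))
print(is_k_consecutive_double_words("12121", 2))
```

Both strings given as examples in the text pass this test: 111 contains the double word 11 at two consecutive positions, and 12121 contains 1212 followed by 2121.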



Proof. Suppose that there is an OCA S = (K, #, f, A) which recognizes the
double word language in real time. Then, we can construct an OCA
\(S' = (K \times K, \#, f', K \times A)\) with two layers satisfying the following conditions 1 to 4.

1. Every state of the first layer transits similarly to that of S.

2. Every cell state in the second layer only transits according to the local
function f, when all the three neighbor cells have accepting states of S in the first
layer.

3. If any cell detects that three or more consecutive accepting states of A follow,
and/or are followed by, a rejecting state, this cell emits a special signal to
the rightmost cell to reject the input string.

4. The accepting states of S' are those containing the accepting states of S in
the second layer.

Since the states in the first layer reach an accepting state of A in every \(2|w| - 1\)
steps, S' recognizes the language

\(\{w^{4|w|-1} : w \in \Sigma^+,\ w\) does not contain 3 or more consecutive double
words \(\}\)

in real time. By Proposition 1, the existence of this language contradicts the
assumption that the double word language is a real-time OCAL. □

4 The subclass of real-time CAL closed under reversal 

It is an open problem whether C(real-time CAL) is closed under reversal and/or
concatenation, whereas C(real-time OCAL) has been shown to be closed under
reversal but not closed under concatenation [14], as stated in Section 1. In this
section, we show two types of real-time CAL’s of which the reversals are also
real-time CAL’s.

A CA S = (K, #, f, A) generates a string \(a_1 a_2 a_3 \cdots a_n \in K^+\) in real time, if
and only if for an initial configuration \(c_0\) with \(n \geq 2\), the i-th configuration
\(c_i\) equals \(u_i a_i \#\) for any \(u_i\) and for all \(1 \leq i \leq n\).

Proposition 4. Let R be a relation \(R \subseteq K \times \Sigma\). If there is a CA which
generates a string \(a_1 a_2 \cdots a_n \in K^+\) in real time, the language
\(\{b_1 b_2 \cdots b_n \in \Sigma^+ : a_i R b_i,\ 1 \leq i \leq n\}\) and its reversal are real-time CAL’s.




226 Katsuhiko Nakamura



Proof. From the CA S, we can construct two CA \(S_1\) and \(S_2\) which recognize
the language and its reversal, respectively. The configurations in \(S_1\) and \(S_2\)
transit as in the space-time diagrams in Fig. 3. In the right diagram for recognizing
the reversal, each input symbol \(b_i\) moves to the right and the rightmost cell tests
the relation \(a_i R b_i\) at Point x. When all the symbols match with the generated
symbols, the rightmost cell has an accepting state. In the left diagram, the
matching of \(b_i\) with \(a_i\) is tested at Point x. The signal moving to the right from
every cell accumulates all the matching results and transfers an accepting signal
to the rightmost cell, if all the matchings succeed. □





Fig. 3. Transition in CA \(S_1\) and \(S_2\) recognizing languages by generation



The unary real-time languages are closely related to those generated by CA. 
Ibarra and Jiang [8] showed that the class of unary real-time CAL’s is closed 
under concatenation and reversal. 

Proposition 5. For any unary real-time CAL \(L \subseteq \{b\}^+\), there is a CA which
generates \(L' \subseteq K^+\) with \(L = h(L')\) in real time, where \(h : K \to \{b\}\) is a
homomorphism.

Proof. This is a direct consequence of the definition. □ 

Proposition 6. The language \(\{b_1 b_2 \cdots b_n \in \{0,1\}^n : b_i = 1\) if i is a prime
number, \(b_i = 0\) otherwise \(\}\) and its reversal are real-time CAL’s.

Proof. Fischer [6] showed a CA which generates the string which is equivalent to
the language of this proposition. □
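
For reference, the strings of this language are easy to describe sequentially. The sketch below is only a direct, non-cellular description of the language, not the real-time generating CA of [6]; the function names are hypothetical.

```python
def is_prime(i):
    """Trial-division primality test, adequate for small indices."""
    return i >= 2 and all(i % d for d in range(2, int(i ** 0.5) + 1))

def prime_indicator(n):
    """The unique string b_1 ... b_n of the language with length n."""
    return "".join("1" if is_prime(i) else "0" for i in range(1, n + 1))

def in_language(s):
    """Membership: s must be the prime-indicator string of its own length."""
    return len(s) > 0 and s == prime_indicator(len(s))

print(prime_indicator(10))
```

The language contains exactly one string of each length, and each one is a prefix of the next, which is what makes it natural to describe by generation.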

The language given in this proposition is a real-time CAL, and not a real-time
OCAL, whose reversal is also a real-time CAL. We show another real-time
CAL such that not only its reversal but also its concatenation with \(\Sigma^*\) is a
real-time CAL.

Proposition 7. The language \(L_1L_1\{0,1\}^*\) is a real-time CAL, where
\(L_1 = \{w : w = 1^j 0^j \text{ or } w = 1^j 0 y 1 0^j,\ y \in \{0,1\}^*,\ j > 0\}\).

Proof. Recall that Terrier [14] proved that \(L_1L_1\) is not a real-time OCAL. We can
construct a CA in which the configurations transit as in Fig. 4 for any input
string \(u_1 u_2 v\) with \(u_1, u_2 \in L_1\), \(v \in \{0,1\}^*\). Both A and B are areas of transition
for recognizing that \(u_1\) and \(u_2\) are in \(L_1\). The coincidence of the two results is
checked by a cell at Point x. □




Fig. 4. Transition of the CA recognizing \(L_1L_1\{0,1\}^*\)



The following proposition is a consequence of Proposition 7. 

Proposition 8. There are languages \(L \subseteq \Sigma^*\) such that L is not a real-time
OCAL, and L, its reversal and \(L\Sigma^*\) are real-time CAL’s.



5 Some candidate languages for solving the open 
problems 

In this section, we discuss some candidate languages for solving the open
problems on real-time CAL’s and their closure properties. Ibarra and Jiang [8]
conjectured that the language \(\{w\$w^n : w \in \{0,1\}^+, n \geq 1\}\) is not a real-time CAL,
whereas its reversal is a real-time CAL. We refute their conjecture based on
their result that \(\{1^m 0^n : m, n \geq 1 \text{ and } m \text{ divides } n\}\), as well as its reversal, is a
real-time CAL (Theorem 4.1 in [8]).

Proposition 9. The language \(\{w\$w^n : w \in \{0,1\}^+, n \geq 1\}\) is a real-time CAL.

Proof. We construct a CA with four layers as follows (Fig. 5). 

1. The first layer is similar to the CA for recognizing the double word language.
The rightmost cell emits the signal $d_0$, with velocity 1 cell per unit time
to the left, at time 0, and the signal $d$ every time it recognizes that the input
string is $w^k$, $k \geq 2$. When the rightmost cell detects another double word
with a different cycle such as uwwuww, it emits a special signal to discard
the effects of the previous signals.

2. In the second layer, the state \$ moves to the right as a signal. When \$ meets
the signals $d_0$ and $d$ of the first layer at a cell, the cell enters a special state $p_0$ and
$p$, respectively. When the signal \$ arrives at the right end and the rightmost
cell recognizes the string $w^k$, $k \geq 2$, it emits the signal $q$ to the left. Otherwise,
the rightmost cell emits $q_0$.

3. In the third layer, every input symbol moves to the right. The signal $q$ checks
that the left side input symbols moving right coincide with the postfix $w^i$. If
a cell with the state $d$, or $d_0$, meets the signal $q$, or $q_0$, respectively, it sends
a signal $r$ to the rightmost cell.

4. The fourth layer is equivalent to the CA for recognizing the language $\{u\$v :
u, v \in \{0,1\}^+ \text{ and } |u| \text{ divides } |v|\}$ in real time. We can construct such a CA,
since the language $\{1^m0^n : m, n \geq 1 \text{ and } m \text{ divides } n\}$ is a real-time CAL
[8].

The signal $r$ is generated if and only if the input string is of the form $w^i\$w^j$, $1 \leq
i \leq j$. The rightmost cell enters an accepting state when it receives the $r$ signal
and the fourth layer reports that $|w^i|$ divides $|w^j|$. □
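
As a point of comparison for the four-layer construction, the language itself can be tested sequentially. The following is only a reference membership test, not a simulation of the cellular automaton; the function name is hypothetical, and the reading $n \geq 1$ of the exponent is an assumption about the garbled source.

```python
def in_language(s: str) -> bool:
    """Sequential membership test for {w$w^n : w in {0,1}+, n >= 1}.

    This is a plain reference check, not the real-time CA of Proposition 9.
    """
    if s.count("$") != 1:
        return False
    w, rest = s.split("$")
    # Both sides must be binary strings and w must be non-empty.
    if not w or set(w) - {"0", "1"} or set(rest) - {"0", "1"}:
        return False
    # Same arithmetic condition as the fourth layer: |w| divides |rest|.
    if not rest or len(rest) % len(w) != 0:
        return False
    # Same comparison as the third layer: rest is a repetition of w.
    return rest == w * (len(rest) // len(w))
```

The two final conditions mirror the division of labour in the proof: one layer handles divisibility of the lengths, another the symbol-by-symbol comparison.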




Fig. 5. Transition of the CA recognizing $\{w\$w^n : w \in \{0,1\}^+, n \geq 1\}$



We conjecture that the language $\{ww : w \in \{0,1\}^+\}\{0,1\}^*$ is not a real-time
CAL. The reasons for this conjecture are as follows.
- The double word language $\{ww : w \in \{0,1\}^+\}$ is so complex that it is not a
real-time OCAL, as shown in Section 3.

- In real-time recognition by a CA, the rightmost cell has the important role of
emitting several control signals. In the case of concatenation of the form $L\Sigma^*$,
the role of the rightmost cell is considerably weakened.

- The construction of the CA for the language $L_1L_1\{0,1\}^*$ in the proof of Proposition 7
suggests that this CA is an exception: It is not likely that there is a
general procedure to convert any CA $S$ into a CA recognizing the concatenation
of the real-time CAL of $S$ and $\Sigma^*$.

Our conjecture is closely related to the conjecture by Ibarra and Jiang [8] that 
: w G {0, !}+}{0, 1}* is not a real-time CAL because of the following 
proposition. 

Proposition 10. If the language {ww : w € {0, 1}* is a real-time CAL, 

then : in G {0, 1}‘*'}{0, 1}* is also a real-time CAL. 

Proof. (Omitted.) 

Another candidate for a non-real-time CAL is a unary language, since there is
an important difference between real-time and linear-time recognition for unary 
languages: The real-time recognition depends on the enumeration of strings and 
not on the input string as shown in Proposition 5. On the other hand, the linear- 
time recognition generally depends on the length of the input. 

6 Concluding remarks 

In this paper, we investigated several languages for solving the open problems on
real-time CA and OCA languages and their closure properties. Although the
real-time recognition power of OCA has been considerably clarified, most fundamental
problems on the real-time recognition power of CA remain unsolved.
The result of real- and linear-time recognition of CA is closely related to the
work on parallel derivation of formal languages [10].

Fig. 6 summarizes the relations among the language classes; it is an updated
version of the diagram in [8]. The arrows represent inclusion relations.
The double boxes represent that the classes are proved to be closed under con- 
catenation. The box with symbol is the subclass of real-time CAL’s that we 
introduced in this paper. 
Abstract. Define the complexity of a regular language as the number of
states of its minimal automaton. Let $A$ (respectively $A'$) be an $n$-state
(resp. $n'$-state) deterministic and connected unary automaton. Our main
results can be summarized as follows:

1. The probability that $A$ is minimal tends toward 1/2 when $n$ tends
toward infinity,

2. The average complexity of $L(A)$ is equivalent to $n$,

3. The average complexity of $L(A) \cap L(A')$ is equivalent to $\frac{3\zeta(3)}{2\pi^2}nn'$,
where $\zeta$ is the Riemann "zeta" function,

4. The average complexity of $L(A)^*$ is bounded by a constant,

5. If $n \leq n' \leq P(n)$, for some polynomial $P$, the average complexity of
$L(A)L(A')$ is bounded by a constant (depending on $P$).

Remark that results 3, 4 and 5 differ perceptibly from the corresponding
worst case complexities, which are $nn'$ for intersection, $(n-1)^2 + 1$ for
star and $nn'$ for concatenation product.



1 Introduction 

This paper addresses a rather natural problem: find the average state complexity
of the basic operations on automata. It is certainly an important question
for both theoretical and practical reasons. It is a part of the subject founded by
Knuth in the sixties [Knu68,Knu69,Knu73], the analysis of algorithms. A general
presentation and a complete introduction to this kind of problem can be found
in [SF96].

However, surprisingly, almost no result is available in the literature. The 
worst case complexity of most operations is known [YZS94,Yu97], but the average 
case analysis seems to be an extremely difficult problem. The main reason is 
that the number of non-isomorphic deterministic and connected automata with 
n states and say, two letters, is not even known! 

This is why we restrict ourselves to the case of one-letter automata. But,
even in this case, non-trivial arguments of number theory are required to analyze
elementary looking operations.

Define the complexity of a regular language as the number of states of its
minimal automaton. Let $A$ (respectively $A'$) be an $n$-state (resp. $n'$-state) deterministic
and connected unary automaton. Our main results can be summarized
as follows:

1. The probability that $A$ is minimal tends toward 1/2 when $n$ tends toward
infinity,

2. The average complexity of $L(A)$ is equivalent to $n$,

3. The average complexity of $L(A) \cap L(A')$ is equivalent to $\frac{3\zeta(3)}{2\pi^2}nn'$, where $\zeta$
is the Riemann "zeta" function,

4. The average complexity of $L(A)^*$ is bounded by a constant,

5. If $n \leq n' \leq P(n)$, for some polynomial $P$, the average complexity of $L(A)L(A')$
is bounded by a constant (depending on $P$).

Remark that results 3, 4 and 5 differ perceptibly from the corresponding worst
case complexities, which are $nn'$ for intersection, $(n-1)^2 + 1$ for star and $nn'$
for concatenation product.

The proofs are too long to be described in this paper. However, in Section 4, 
we present a sketch of one proof to illustrate the kind of techniques used here.

2 Notations 

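If $f, g$ are two functions from $\mathbb{N} \times \mathbb{N}$ into $\mathbb{R}$, we say that $f$ is equivalent to $g$
(denoted by $f \asymp g$) if there exists a function $\varepsilon$ from $\mathbb{N} \times \mathbb{N}$ into $\mathbb{R}$ such that the
two following statements hold:

- for all $(n, n')$ in $\mathbb{N}^2$, $f(n,n') = (1 + \varepsilon(n,n'))\,g(n,n')$

- $\varepsilon(n,n') \to 0$ when $\min\{n,n'\} \to \infty$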

If $f$ is a function from $\mathbb{N} \times \mathbb{N}$ into $\mathbb{R}^+$, we say that $f$ is polynomially bounded
by a non-negative real constant $C$ (denoted by $f \preceq_P C$) if, for every polynomial
$P \in \mathbb{N}[X]$, there exists an integer $N_P \in \mathbb{N}$ such that, for each $n, n' \in \mathbb{N}$ with
$n \geq N_P$ and $n \leq n' \leq P(n)$, $f(n,n') \leq C$. Of course $C$ depends on the choice of
$P$.

For each $n, n' \in \mathbb{N}$, we denote respectively by $n \vee n'$ and $n \wedge n'$ the lcm and
the gcd of $n$ and $n'$. We denote by $d \mid n$ the fact that the integer $d$ divides the
integer $n$.

Given a deterministic automaton $A$, $|A|$ denotes the number of its states and
$\|A\|$ the number of states of its minimal automaton. By extension, if $L$ is a regular
language, we denote by $\|L\|$ the number of states of its minimal automaton,
that is, its complexity. Note that if $A(L)$ is any automaton recognizing $L$, then
$\|A(L)\| = \|L\|$.
Let $S$ be a finite subset of a set $T$. If $f$ is a function from $T$ into $\mathbb{R}$, we denote
by $(f, S)$ the sum $\sum_{s \in S} f(s)$.

3 The number of minimal automata 

In this section we enumerate the minimal unary automata (see [Eil74] [HU79])
with $n$ states. For this purpose, we establish and use the characterization lemma
which is very useful for a combinatorial analysis of minimal unary automata.
To avoid any isomorphism problems, we fix a rule for the labels of the states.
For every deterministic and connected automaton with $n$ states, with initial state
$q_0$, we label each state $q$ with the smallest integer $i$ such that $q_0 \cdot a^i = q$. This
condition prevents two distinct automata from being isomorphic.

A deterministic complete and connected unary automaton is always of the
following form, for some $k \in \{0, \ldots, n-1\}$ (the final states are omitted):




Therefore, such an automaton is totally determined by the integer $k$ and its
set of final states. More precisely, it is equal to one of the automata $A(n,k,F)$
defined as follows: given two integers $k$ and $n$ such that $0 \leq k \leq n-1$ and
a subset $F$ of $\{0, \ldots, n-1\}$, $A(n,k,F)$ is the unary automaton whose set of
states is $Q = \{0, \ldots, n-1\}$ and whose transition function is given by $q \cdot a = q + 1$ for
$0 \leq q \leq n-2$ and $(n-1) \cdot a = k$. The initial state of this automaton is 0 and its
set of final states is $F$.

The loop of $A = A(n,k,F)$, denoted by $loop(A)$, is the automaton $A(n-k, 0, F')$
where $F' = \{i \in [0, n-k-1] \mid i + k \in F\}$. The automaton $A$ is simply
called a loop if it is equal to its loop, that is, if and only if $k = 0$.
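
The definitions of $A(n,k,F)$ and of its loop translate directly into code. The sketch below is only illustrative; the function names are hypothetical and an automaton is represented by the triple (n, k, set of final states).

```python
def successor(n, k, q):
    """Transition q.a in A(n, k, F): q + 1 on the chain, back to k from n - 1."""
    return q + 1 if q < n - 1 else k

def accepts(n, k, final, length):
    """True if A(n, k, final) accepts the word a^length."""
    state = 0  # the initial state is always 0
    for _ in range(length):
        state = successor(n, k, state)
    return state in final

def loop_of(n, k, final):
    """The loop A(n - k, 0, F') with F' = {i : i + k in F}."""
    return n - k, 0, {i for i in range(n - k) if i + k in final}
```

For instance, `loop_of(5, 2, {1, 4})` drops the two tail states and renumbers the three loop states from 0.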

Loops play an important role in the next lemma, which characterizes minimal
unary automata. Two states of an automaton are said to have the same finality
if they are either both final or both non-final.

Lemma 1. (Characterization Lemma) An automaton A{n,k,F) is minimal if 
and only if the two following conditions hold: 

1. its loop is minimal 

2. if $k \neq 0$, the states $k-1$ and $n-1$ do not have the same finality.
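
The characterization lemma can be checked exhaustively for small $n$. The sketch below is not part of the paper's argument: it minimizes each automaton by Moore-style partition refinement (a standard technique chosen here for illustration) and compares the outcome with the two conditions of the lemma. All function names are hypothetical.

```python
from itertools import combinations

def minimal_size(succ, final):
    """Number of states of the minimal DFA of a unary DFA whose states are all
    reachable; succ[q] is the successor of q, final[q] its finality."""
    block = [int(f) for f in final]
    while True:
        signature = {}
        refined = [signature.setdefault((block[q], block[succ[q]]), len(signature))
                   for q in range(len(succ))]
        if len(signature) == len(set(block)):  # no class was split: stable
            return len(signature)
        block = refined

def is_minimal(n, k, final):
    succ = list(range(1, n)) + [k]
    return minimal_size(succ, [q in final for q in range(n)]) == n

def lemma_holds(n, k, final):
    loop_final = {i for i in range(n - k) if i + k in final}
    loop_ok = is_minimal(n - k, 0, loop_final)                  # condition 1
    tail_ok = k == 0 or ((k - 1) in final) != ((n - 1) in final)  # condition 2
    return is_minimal(n, k, final) == (loop_ok and tail_ok)

def check(max_n):
    """Test the lemma on every A(n, k, F) with n <= max_n."""
    return all(lemma_holds(n, k, set(chosen))
               for n in range(1, max_n + 1)
               for k in range(n)
               for r in range(n + 1)
               for chosen in combinations(range(n), r))
```

Calling `check(7)` enumerates all $n2^n$ automata for each $n \leq 7$.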

We are now ready to evaluate the average number of minimal automata.
First, denoting by $U_n$ the set of complete, deterministic and connected unary
automata with $n$ states (with the proper labels on their states), it is easy to see
that $|U_n| = n2^n$.

Next we enumerate the minimal $n$-loops (loops with $n$ states). Fix an integer
$n$. For every $n$-loop $\mathcal{L} = A(n, 0, F)$ define

$$k_{\min}(\mathcal{L}) = \min\{k \in [1,n] \mid F \cdot a^k = F\}$$

Note that $k_{\min}(\mathcal{L})$ exists since $F \cdot a^n = F$. An $n$-loop $\mathcal{L}$ is said to be primitive if
$k_{\min}(\mathcal{L}) = n$.

We can characterize minimal loops: 
Lemma 2. For each $n$-loop $\mathcal{L}$, the minimal automaton of $\mathcal{L}$ has $k_{\min}(\mathcal{L})$ states
and $k_{\min}(\mathcal{L})$ divides $n$. In particular a loop $\mathcal{L}$ is primitive if and only if it is
minimal.

Denoting by $\mu$ the Möbius function, we have the following result:

Theorem 1. There are exactly $\sum_{d \mid n}\mu(d)2^{n/d}$ minimal $n$-loops. This number
is equivalent to $2^n$. Furthermore, there are no more than non-minimal
loops with $n$ states.

The proof of this theorem is very classical. Using Lemma 2, we can reduce the
problem to the well-known problem of counting the number of primitive circular
words on a two-letter alphabet, which justifies the definition of a primitive loop.
This number is also $n$ times the number of irreducible polynomials of degree $n$
over $\mathbb{F}_2$, the field with two elements, and a natural bijection has recently been
found, using Galois theory arguments [Del99]. For a survey of contexts where
the same kind of numbers appear, see [All99].
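
The count of minimal $n$-loops can be compared with the classical Möbius-inversion count of primitive binary words, $\sum_{d \mid n}\mu(d)2^{n/d}$. The following sketch does this by brute force for small $n$; it assumes that this classical formula is the one intended in Theorem 1, and all names are hypothetical.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # a squared prime factor
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def formula(n):
    """Sum over the divisors d of n of mu(d) * 2^(n/d)."""
    return sum(mobius(d) * 2 ** (n // d) for d in range(1, n + 1) if n % d == 0)

def minimal_size(succ, final):
    """Moore-style refinement: size of the minimal DFA of a unary DFA."""
    block = [int(f) for f in final]
    while True:
        signature = {}
        refined = [signature.setdefault((block[q], block[succ[q]]), len(signature))
                   for q in range(len(succ))]
        if len(signature) == len(set(block)):
            return len(signature)
        block = refined

def count_minimal_loops(n):
    """Brute-force count of the n-loops A(n, 0, F) that are minimal."""
    succ = list(range(1, n)) + [0]
    return sum(minimal_size(succ, [bool(mask >> q & 1) for q in range(n)]) == n
               for mask in range(2 ** n))
```

The number of non-minimal $n$-loops is then `2 ** n - formula(n)`, which is small compared with $2^n$.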

This result is very important as it says that very few loops are not minimal. 
Thus, as a first approximation, we can consider that each loop is minimal. Indeed, 
for all the average analysis of this paper, unary automata behave as if their loops 
were minimal. Using the characterization lemma, we can give an equivalent to 
the number of minimal unary automata. 

Theorem 2. The number of minimal automata with $n$ states is equivalent to
$n2^{n-1}$.

We define the average number of states of the minimal automaton of an
$n$-state automaton as $\frac{1}{|U_n|}(\|\cdot\|, U_n)$. The following theorem shows that the
number of states of the minimal automaton of a deterministic connected automaton
is very close to the number of states of this automaton:

Theorem 3. The average number of states of the minimal automata of an n- 
state automaton is equivalent to n. 

The proof is not difficult, and the result suggests that it is seldom useful to
spend time minimizing a unary automaton.
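
Theorems 2 and 3 can be observed numerically on small instances. The sketch below enumerates $U_n$ and reports the proportion of minimal automata and the average complexity divided by $n$; it is an illustration only, with hypothetical function names, and small values of $n$ say nothing rigorous about the limits.

```python
def minimal_size(succ, final):
    """Moore-style refinement: size of the minimal DFA of a unary DFA."""
    block = [int(f) for f in final]
    while True:
        signature = {}
        refined = [signature.setdefault((block[q], block[succ[q]]), len(signature))
                   for q in range(len(succ))]
        if len(signature) == len(set(block)):
            return len(signature)
        block = refined

def statistics(n):
    """Return (proportion of minimal automata in U_n, average complexity / n)."""
    total = n * 2 ** n          # |U_n| = n 2^n
    minimal_count = 0
    size_sum = 0
    for k in range(n):
        succ = list(range(1, n)) + [k]
        for mask in range(2 ** n):
            final = [bool(mask >> q & 1) for q in range(n)]
            size = minimal_size(succ, final)
            size_sum += size
            minimal_count += size == n
    return minimal_count / total, size_sum / total / n
```

A call such as `statistics(8)` runs over 2048 automata and returns the two ratios.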

4 Intersection 

In this section, we give the average and worst case complexity of the intersection 
on unary automata. Remark that the union has exactly the same behavior as 
the intersection since the minimal automaton of a regular language L has the 
same number of states as the minimal automaton of its complement. 

Fix two integers $n$ and $n'$ greater than 2. For every $(A, A') \in U_n \times U_{n'}$, of
respective initial states $q_0$ and $q'_0$, define the product automaton $A \times A'$ as the
automaton whose initial state is $(q_0, q'_0)$ and whose set of states is the set of pairs
$(q_0 \cdot a^i, q'_0 \cdot a^i)$, $i \in \mathbb{N}$, reachable from the initial state. The transition function of this
automaton is defined by $(q,q') \cdot a = (q \cdot a, q' \cdot a)$, and a reachable pair $(q,q')$ is final
if and only if $q$ (respectively $q'$) is a final state of $A$ (resp. $A'$).

It is well-known that this automaton recognizes the intersection of $L(A)$ and
$L(A')$.
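
The product construction for unary automata can be sketched as follows. The code builds only the reachable pairs, in the order in which they are visited, and then minimizes the result; names are hypothetical and the minimization routine is a standard partition refinement, not taken from the paper.

```python
def product(n, k, final, n2, k2, final2):
    """Reachable part of A(n,k,final) x A(n2,k2,final2) as (succ, finality)."""
    index, order = {}, []
    state = (0, 0)
    while state not in index:
        index[state] = len(order)
        order.append(state)
        q, q2 = state
        state = (q + 1 if q < n - 1 else k, q2 + 1 if q2 < n2 - 1 else k2)
    # States are numbered along the run; the last one jumps back into the loop.
    succ = list(range(1, len(order))) + [index[state]]
    finality = [q in final and q2 in final2 for q, q2 in order]
    return succ, finality

def minimal_size(succ, final):
    """Moore-style refinement: size of the minimal DFA of a unary DFA."""
    block = [int(f) for f in final]
    while True:
        signature = {}
        refined = [signature.setdefault((block[q], block[succ[q]]), len(signature))
                   for q in range(len(succ))]
        if len(signature) == len(set(block)):
            return len(signature)
        block = refined

def intersection_complexity(n, k, final, n2, k2, final2):
    """Complexity of L(A) intersected with L(A')."""
    return minimal_size(*product(n, k, final, n2, k2, final2))
```

In this representation the product has a tail of $\max(k, k')$ states followed by a loop of $(n-k) \vee (n'-k')$ states, which is why its size can be computed exactly.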

Our first result concerns the worst case complexity. It slightly improves a
result of [YZS94] (they only consider the case when $n$ and $n'$ are coprime),
and uses the fact that for $n$ large enough, there always exists a prime number
between $n - n^\alpha$ and $n$, for some real number $\alpha \in ]0,1[$ [BH96,Dav74,Hux72]:

Proposition 1. In the worst case, the complexity of the intersection is equiva- 
lent to nn' . 



Denote by $U$ the set of all complete, deterministic and connected unary automata.
The average complexity of the intersection is exactly

$$\frac{1}{|U_n \times U_{n'}|}\,(\|\times\|, U_n \times U_{n'})$$

where $\|\times\|$ is the function from $U \times U$ into $\mathbb{N}$ which maps $(A, A')$ onto $\|A \times A'\|$.

Our main result is a precise evaluation of the average complexity of the
intersection:

Theorem 4. The average complexity of the intersection of an $n$-state automaton
and an $n'$-state automaton is equivalent to $\frac{3\zeta(3)}{2\pi^2}nn'$.

The proof requires a result from analytic number theory established by G. 
Tenenbaum [Ten97] along classical techniques (see, e.g., [Ten96]). The result is 
interesting on its own account and we now state it formally. 

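Theorem 5. [Tenenbaum] The following result holds:

$$\sum_{\substack{1 \leq l \leq n \\ 1 \leq l' \leq n'}} l \vee l' = \frac{3\zeta(3)}{2\pi^2}(nn')^2\left(1 + O\!\left(\frac{\ln z}{z}\right)\right)$$

with $z = \min\{n, n'\}$. Thus

$$\sum_{\substack{1 \leq l \leq n \\ 1 \leq l' \leq n'}} l \vee l' \asymp \frac{3\zeta(3)}{2\pi^2}(nn')^2$$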

Sketch of the proof of Theorem 4: We exhibit an upper and a lower bound for the
average complexity of the intersection, both equivalent to $\frac{3\zeta(3)}{2\pi^2}nn'$. For the upper bound,
we use the fact that if $A$ is an $n$-state automaton and $A'$ an $n'$-state automaton,
$\|L(A) \cap L(A')\| \leq |A \times A'|$. We can compute exactly the number of states of
$A \times A'$. Moreover the loop of $A \times A'$ contains $|loop(A)| \vee |loop(A')|$ states and
thus we can prove that



$$\sum_{A \in U_n}\sum_{A' \in U_{n'}} |A \times A'| \asymp \sum_{A \in U_n}\sum_{A' \in U_{n'}} |loop(A)| \vee |loop(A')|$$
After some calculations, we conclude by Theorem 5 that the average complexity of the intersection
is bounded by a function equivalent to $\frac{3\zeta(3)}{2\pi^2}nn'$.

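For the lower bound, we construct a set $G(l,l')$ of pairs $(\mathcal{L}, \mathcal{L}')$ where $\mathcal{L}$ is
an $l$-loop and $\mathcal{L}'$ is an $l'$-loop. This set is such that for every $(\mathcal{L},\mathcal{L}') \in G(l,l')$,
$\mathcal{L} \times \mathcal{L}'$ is minimal. Hence as

$$(\|\times\|, U_n \times U_{n'}) \geq \sum_{l=1}^{n}\sum_{l'=1}^{n'}\ \sum_{(\mathcal{L},\mathcal{L}') \in G(l,l')}\ \sum_{\substack{A \in U_n \\ loop(A)=\mathcal{L}}}\ \sum_{\substack{A' \in U_{n'} \\ loop(A')=\mathcal{L}'}} \|A \times A'\|$$

and since for every $l$-loop $\mathcal{L}$ with $1 \leq l \leq n$, there are exactly $2^{n-l}$ $n$-state
automata whose loop is $\mathcal{L}$,

$$(\|\times\|, U_n \times U_{n'}) \geq 2^{n+n'}\sum_{l=1}^{n}\sum_{l'=1}^{n'} |G(l,l')|\,2^{-l}2^{-l'}(l \vee l')$$

Hence, to prove the theorem we have to construct a large enough set $G(l,l')$ so
that

$$\sum_{l=1}^{n}\sum_{l'=1}^{n'} |G(l,l')|\,2^{-l}2^{-l'}(l \vee l') \asymp \sum_{l=1}^{n}\sum_{l'=1}^{n'} l \vee l'$$

and we conclude using Theorem 5.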

To construct $G(l,l')$, we remove some subsets from $B(l,l')$, the set of all pairs
of loops $(\mathcal{L}, \mathcal{L}')$ such that $\mathcal{L}$ is an $l$-loop and $\mathcal{L}'$ is an $l'$-loop. We first remove all
the pairs of loops $(\mathcal{L}, \mathcal{L}')$ such that $L(\mathcal{L})$ or $L(\mathcal{L}')$ is either finite or cofinite. It
is not difficult to see that there are no more than $2^l + 2^{l'}$ such pairs in $B(l,l')$.
Define $H(l,l')$ as the subset of $B(l,l')$ obtained after removing such pairs of loops.

Define the property V{l,V) that is true if and only if I A I' > por 

technical reasons we want that $G(l,l') = \emptyset$ if $\mathcal{P}(l,l')$ is satisfied. This is not
restrictive since there are not many pairs $(l,l')$ that satisfy $\mathcal{P}$.

We first work in the case when $\mathcal{P}$ is not satisfied by $l$ and $l'$: we want to
remove from $H(l,l')$ the pairs $(\mathcal{L},\mathcal{L}')$ such that $\mathcal{L} \times \mathcal{L}'$ is not minimal. Define
$B = \mathcal{L} \times \mathcal{L}'$. We distinguish two kinds of pairs, according to whether $l \wedge l'$ divides
$\|B\|$ or not.

- If $l \wedge l'$ divides $\|B\|$: we exhibit a condition sufficient to ensure that a pair of
loops is such that its product is not minimal. The following lemma characterizes
non-minimal loops in the particular case $l \wedge l' = 1$:

Lemma 3. Let $l, l' > 1$ be two integers such that $l \wedge l' = 1$. If $\mathcal{L}$ is an $l$-loop
and $\mathcal{L}'$ an $l'$-loop then $\mathcal{L} \times \mathcal{L}'$ is minimal if and only if both $\mathcal{L}$ and $\mathcal{L}'$ are
minimal.

We want to use this lemma even if $l \wedge l' \neq 1$. We have to introduce some new
notations. For every $i \in \{0, \ldots, d-1\}$, define the loop $\mathcal{L}_i^d = A(l/d, 0, F_i^d)$
where

$$F_i^d = \{\, j \in \{0, \ldots, (l/d)-1\} \mid dj + i \text{ is a final state of } \mathcal{L} \,\}$$
The construction of $\mathcal{L}_i^d$ is motivated by the following property, which holds
for every $d$ dividing $l$ and $l'$ and every $i \in \{0, \ldots, d-1\}$:

$$\mathcal{L}_i^d \times \mathcal{L}'^{\,d}_i = (\mathcal{L} \times \mathcal{L}')_i^d$$

Fix $d = l \wedge l'$. The integer $d$ divides $\|B\|$ by hypothesis. If $B$ is not minimal
then, by Lemma 2, $\|B\|$ strictly divides $|B|$. Hence for every $i \in \{0, \ldots, d-1\}$,
$(\mathcal{L} \times \mathcal{L}')_i^d$ is not minimal. Thus $\mathcal{L}_i^d \times \mathcal{L}'^{\,d}_i$ is not minimal and, as $l/d \wedge l'/d =
1$, $\mathcal{L}_i^d$ or $\mathcal{L}'^{\,d}_i$ is not minimal, by Lemma 3.

Therefore, if there exists $i \in \{0, \ldots, d-1\}$ such that both $\mathcal{L}_i^d$ and $\mathcal{L}'^{\,d}_i$ are
minimal then $\mathcal{L} \times \mathcal{L}'$ is minimal. Using the fact that we are working on $l, l'$
which do not satisfy $\mathcal{P}$, we can bound the number of pairs of loops such
that B is not minimal and I A I' divides ||S|| by , for I < I'. 

- If $d = l \wedge l'$ does not divide $\|B\|$: the characterization of minimal products
of loops is completely different in this case. We introduce an equivalence
relation $\equiv$ on $\{0, \ldots, d-1\}$ defined by

$i \equiv j \iff$ there exists $k$ such that $(i - j + k\|B\|) \bmod d = 0$

We first prove that every equivalence class contains the same number $m$
of elements and that each class contains at least two elements. Moreover if
$i \equiv j$ then $(\mathcal{L} \times \mathcal{L}')_i^d$ and $(\mathcal{L} \times \mathcal{L}')_j^d$ recognize the same language. Hence,
since they have the same number of states, $(\mathcal{L} \times \mathcal{L}')_i^d = (\mathcal{L} \times \mathcal{L}')_j^d$. With
this considerations we can prove that there are at most < 2*/^2* 

pairs of loops such that their product is not minimal and such that $d = l \wedge l'$
does not divide $\|A \times A'\|$.
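
Both Lemma 3 and the decomposition property of the loops $\mathcal{L}_i^d$ can be checked exhaustively on small loops. In the sketch below a loop is represented by its tuple of finality bits, and minimality is tested through primitivity (Lemma 2); the representation and all function names are choices made for this illustration.

```python
from math import gcd

def loops(l):
    """All l-loops, each given by the tuple of finality bits of states 0..l-1."""
    return [tuple(bool(mask >> q & 1) for q in range(l)) for mask in range(2 ** l)]

def is_minimal(bits):
    """A loop is minimal iff it is primitive: no proper rotation fixes it."""
    return all(bits != bits[k:] + bits[:k] for k in range(1, len(bits)))

def product(bits, bits2):
    """Product of two loops: a loop with lcm(l, l') states."""
    m = len(bits) * len(bits2) // gcd(len(bits), len(bits2))
    return tuple(bits[t % len(bits)] and bits2[t % len(bits2)] for t in range(m))

def component(bits, d, i):
    """The loop with l/d states whose state j is final iff d*j + i is final."""
    return tuple(bits[d * j + i] for j in range(len(bits) // d))

def lemma3_holds(l, l2):
    return all(is_minimal(product(x, y)) == (is_minimal(x) and is_minimal(y))
               for x in loops(l) for y in loops(l2))

def decomposition_holds(l, l2, d):
    return all(component(product(x, y), d, i)
               == product(component(x, d, i), component(y, d, i))
               for x in loops(l) for y in loops(l2) for i in range(d))
```

`lemma3_holds` is meant to be called with coprime $l, l' > 1$, and `decomposition_holds` with a common divisor $d$ of $l$ and $l'$.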

Hence we construct $G(l,l')$ by removing these pairs of loops. Putting all results
together we can prove that if $l$ and $l'$ do not satisfy $\mathcal{P}$ then, for $l \leq l'$, $|G(l,l')| \geq
2^{l+l'} - 2^{\alpha l}2^{l'}$ for some real $\alpha \in ]0,1[$. But

$$\sum_{l=1}^{n}\sum_{l'=1}^{n'} 2^{\alpha l}2^{l'}2^{-l}2^{-l'}(l \vee l') \leq Cn'^2$$

for some constant $C$. Hence by bounding the number of $l, l'$ satisfying $\mathcal{P}$ we can
prove Theorem 4.

Using the same kind of methods we can also prove that the result of Theo- 
rem 4 still holds if we consider the average on minimal automata only: 

Theorem 6. The following result holds: 



Thus for the intersection the average and worst cases only differ by a mul- 
tiplicative constant. The theorem also shows that the naive algorithm which 
constructs the product automaton cannot be substantially improved. 



\MU 
5 Star operation

The purpose of this section is to prove that the average state complexity of 
the star operation is bounded. S. Yu, Q. Zhuang and K. Salomaa have already 
proved that in the worst case it is quadratic in the number of states [YZS94]: 

Theorem 7. For every regular language $L$ of complexity $n$, the complexity of
$L^*$ is bounded by $(n-1)^2 + 1$. Furthermore, this upper bound is reached for every
$n > 1$.

We establish the following result, where $\|{*}\|$ is the function from $U_n$ into $\mathbb{N}$
such that the image of an $n$-state automaton $A$ is $\|L(A)^*\|$.

Theorem 8. There exists a constant $C \in \mathbb{R}^+$ such that for every $n \geq 2$,

$$\frac{1}{|U_n|}\,(\|{*}\|, U_n) \leq C$$

In the proof of the theorem we encode automata by words on the alphabet
{0, 1}. Removing a negligible subset of $U_n$ containing all the automata such that
no two consecutive states are both final, we reduce the problem to a problem of
combinatorics on words, which is sufficient to prove the theorem. Remark that
the bound found in the proof is approximately 50, whereas an experimental
computation gives a bound lower than 6.

This result shows that the average behavior of the star operation is very
different from its worst case behavior, since the first one is bounded whereas
the second one has a quadratic growth. Moreover we can use this result to
obtain an algorithm that constructs the minimal automaton of the star of a given
regular language that has an average complexity in $O(1)$ whereas the classical
construction is in $O(n^2)$.
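
The complexity of $L(A)^*$ can be computed for small unary automata with the classical subset construction followed by minimization. This sketch illustrates that classical construction only, not the $O(1)$ average-time algorithm mentioned above; all names are hypothetical.

```python
def minimal_size(succ, final):
    """Moore-style refinement: size of the minimal DFA of a unary DFA."""
    block = [int(f) for f in final]
    while True:
        signature = {}
        refined = [signature.setdefault((block[q], block[succ[q]]), len(signature))
                   for q in range(len(succ))]
        if len(signature) == len(set(block)):
            return len(signature)
        block = refined

def star_complexity(n, k, final):
    """Number of states of the minimal automaton of L(A(n, k, final))*."""
    final = set(final)

    def step(subset):
        moved = frozenset(q + 1 if q < n - 1 else k for q in subset)
        # After completing a word of L, a new factor may start in state 0.
        return moved | {0} if moved & final else moved

    index, order = {}, []
    state = (frozenset({0}), True)   # the flag marks the extra accepting start
    while state not in index:
        index[state] = len(order)
        order.append(state)
        state = (step(state[0]), False)
    succ = list(range(1, len(order))) + [index[state]]
    finality = [is_start or bool(subset & final) for subset, is_start in order]
    return minimal_size(succ, finality)

def average_star(n):
    """Average of the star complexity over all n 2^n automata of U_n."""
    total = sum(star_complexity(n, k, {q for q in range(n) if mask >> q & 1})
                for k in range(n) for mask in range(2 ** n))
    return total / (n * 2 ** n)
```

Since the automaton is unary, the subset construction follows a single run of subsets, so the determinized automaton is again a tail followed by a loop.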

6 Concatenation product 

The purpose of this section is to prove that the concatenation product is
polynomially bounded.

S. Yu, Q. Zhuang and K. Salomaa gave the following result: 

Theorem 9. [YZS94] For all regular languages $L$ and $L'$ such that $\|L\| = n$
and $\|L'\| = n'$, $\|LL'\| \leq nn'$.

They also proved that the bound is reached if $n \wedge n' = 1$.

With this result we can obtain an equivalent of the worst case complexity
of the product of two languages (once more we use number theory results
[BH96,Dav74,Hux72]):

Proposition 2. The complexity in the worst case of the concatenation product 
of two unary automata with respectively n and n' states is equivalent to nn' . 
We are now going to show that in the average case, the asymptotic
complexity of the concatenation product is bounded, provided the growth of $n'$
is bounded by a polynomial in $n$. The image of $(A, A') \in U_n \times U_{n'}$ by $\|\cdot\|$ is the
integer $\|L(A)L(A')\|$.



Theorem 10. There exists $C \in \mathbb{R}^+$ such that

$$\frac{1}{|U_n \times U_{n'}|}\,(\|\cdot\|, U_n \times U_{n'}) \preceq_P C$$

Once again we establish the proof by removing negligible subsets of $U_n \times U_{n'}$.
We also encode automata with words and use combinatorics on words. We
proceed in three steps, for $n \leq n' \leq P(n)$ for some polynomial $P$:


- We first remark that almost all pairs of automata $(A, A')$ are such that
$L(A)L(A')$ contains every word of length between $[n/2]$ and $[3n/2]$. This
step is quite technical and uses basic combinatorial tools.

- Almost all pairs satisfying the first condition are such that $L(A)L(A')$ is
cofinite. To prove this we consider two cases, according to whether the loop of $A$ contains
more than $\sqrt{n}$ states or not.

- Finally we precisely compute the size of the minimal automaton of $L(A)L(A')$
for pairs of automata satisfying the two previous conditions.

Remark that the condition n < n' < P{n) is certainly necessary to obtain a 
bounded average complexity, but is not very restrictive in practice. 



7 Conclusion 

Putting all things together we can summarize our results as follows:

Operation               Worst case                        Average case
Minimization            anything in $\{1, \ldots, n\}$    $\sim n$
Star operation          $(n-1)^2 + 1$                     $\leq C$
Concatenation product   $\asymp nn'$                      $\preceq_P C$
Intersection            $\asymp nn'$                      $\asymp \frac{3\zeta(3)}{2\pi^2}nn'$

Abstract. We separate the class of languages accepted by deterministic
two-way one counter automata from the languages accepted by two-dimensional
rebound automata. We also discuss the relationship of the
classes to languages accepted by rebound automata with $k$-dimensional
input for $k \geq 3$. Further, we answer negatively the question whether the classes of
languages accepted by deterministic or nondeterministic rebound automata
are closed under length-preserving homomorphisms.



1 Introduction 

The first investigation of finite automata operating on a two-dimensional tape
is due to Blum and Hewitt, who proposed several models for sequential computations
on a grid representing a picture [1]. A survey of the ensuing research in
this area has been given by Inoue and Takanami [4]. In contrast to the situation
for one-dimensional input strings, on a two-dimensional input nondeterministic
finite automata are strictly more powerful than deterministic finite automata.

The rebound automaton, a variant of the two-dimensional finite automaton
that is more closely related to one-dimensional automata, was introduced
by Sugata, Umeo, and Morita [11,7]. It will become clear from the description
below that rebound automata are very similar to one-dimensional counter automata.
The main difference is that for rebound automata counting and accessing
their input are not independent. The question that we will mainly investigate is
whether this leads to a strictly weaker model of computation.

Rebound automata receive a one-dimensional input string $a_1a_2\cdots a_m$ as the
contents of the first row of an otherwise empty, square input tape. The border
of this square is marked with a special symbol B:
The automata are equipped with a read-only head that scans one tape-cell
at a time. It cannot leave the input square and can be moved up, down, left, or
right under the control of a finite, deterministic or nondeterministic program.
As usual the head is initially located on the left upper cell and the automaton
can accept or reject input strings.

While the separation of deterministic and nondeterministic two-dimensional
finite automata mentioned above does not carry over to rebound automata, these
automata are able to accept many non-regular languages [11], e.g.

$\{a^nb^nc^n \mid n \geq 1\}$, $\{ww^R \mid w \in \{0,1\}^*\}$, $\{ww \mid w \in \{0,1\}^*\}$,

where $w^R$ denotes the reversal of $w$. We sketch the way a rebound automaton
decides the last language in the list in order to illustrate how the automaton can
take advantage of the two-dimensional input tape. It first checks that the length
of the input is even. Then it records the first input symbol and starts to move
its head repeatedly one position to the right and two cells down. When it has
reached the lower border it moves its head up onto the input string and compares
the symbol read with the one recorded in its finite control. Note that the two
symbols are at positions half of the input length apart. If the symbols do not
agree, the input is rejected. Otherwise the head returns to its initial position
and the next symbol is compared. This process is repeated until all pairs of
corresponding symbols have been successfully compared or a mismatch occurs.
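
The procedure for $\{ww \mid w \in \{0,1\}^*\}$ can be traced with a small simulation that tracks only the head position and one remembered symbol. The details of how the head returns and how termination is detected (the right border being hit before the lower border) are assumptions made for this sketch, as is the function name.

```python
def rebound_accepts_ww(word):
    """Trace the head of a rebound automaton deciding {ww} on an m x m tape."""
    m = len(word)
    if m % 2 == 1:              # parity check: one sweep with two states
        return False
    start = 0                   # column of the symbol currently being checked
    while True:
        row, col = 0, start
        if col >= m:            # only happens for the empty input
            return True
        remembered = word[col]  # stored in the finite control
        # Diagonal descent: one cell right, two cells down, until a border.
        while True:
            col += 1
            if col >= m:        # right border first: every pair was compared
                return True
            row += 2
            if row >= m:        # lower border reached
                break
        row = 0                 # move straight up onto the input row
        if word[col] != remembered:
            return False
        # Assumed return path: mirror descent (one left, two down), then up.
        while row < m:
            col -= 1
            row += 2
        row = 0
        start = col + 1         # proceed with the next input symbol
```

The descent takes $m/2$ diagonal steps, so the column reached is exactly half of the input length to the right of the starting column.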

The languages mentioned above can also be accepted by deterministic two-way
counter automata, one-dimensional automata with a single counter that
can be incremented, decremented, and tested for being 0. Intuitively, movements
of the input head of a two-dimensional rebound automaton onto different rows
of the input correspond to increment and decrement operations on a counter.
It was shown by Morita, Sugata, and Umeo that every deterministic rebound
automaton can be simulated by a deterministic two-way counter automaton [7].
Note that this fact is nontrivial, since a rebound automaton can recognize the
border of its input tape while the counter automaton has no direct way of testing
that the counter has reached a value that exceeds the input length. Despite the
close relationship, a fundamental difference remains between counter automata
and rebound automata: the rebound automaton is not able to access its input
while the head is not scanning the first row of the tape.

A language that appears to capture the essence of counting is 

$L = \{w \mid w \in \{0,1\}^*, |w|_0 = |w|_1\}$

($|w|_x$ is the number of symbols $x$ in $w$). Sugata, Umeo, and Morita pointed
out that this language does not seem to be accepted by rebound automata and 
thus conjectured that counter automata are strictly more powerful than rebound 
automata [7]. 

A partial solution of the problem of separating counter automata and rebound
automata was obtained by Sakamoto, Inoue, and Takanami [10]. Using
a more complicated language than $L$ defined above, which can be accepted by
a nondeterministic two-way counter automaton, they showed that no nondeterministic
rebound automaton is able to decide this language correctly. This
left open the corresponding question from [11,7] for deterministic automata, a
problem mentioned again in [10] and [12].

In this paper we will design a problem that can be decided by a deterministic 
two-way counter automaton (in the strong sense of [12], the counter is bounded 
by the input length) but cannot be recognized by rebound automata, even in 
their nondeterministic version. In this way we strengthen the separation from 
[10] and prove the conjecture from [7]. The idea is to fool every rebound automaton
by a suitable input string from an especially hard language and show
that it necessarily accepts some illegal input along with the string in the lan- 
guage. In spirit this is similar to the well-known pumping theorems for one-way 
machines. Due to the more complicated flow of information it is however signifi- 
cantly harder to fool machines that admit re-reading of input segments. Another 
example where two-way machines have been fooled is the separation of counter 
and pushdown for deterministic machines by Duris and Galil [2]. As a second 
application of fooling rebound automata we will show non-closure under length- 
preserving homomorphisms. 



2 Separation of Counter Automata and Rebound 
Automata 

The proof of Lemma 2 uses an argument similar to the one for Lemma 1 of 
[10]. The idea of these proofs can easily be explained in terms of communication 
complexity [3]. We imagine that two communicating parties receive parts of the 
input of decision problem H (to be defined later) which is a “hard” problem for 
rebound automata. Then they cooperate by exchanging information based on 
the behavior of the automaton in order to correctly determine membership in 
the language. Now two requirements have to be satisfied by H: 

- The amount of information that can be transferred by a suitable deterministic
counter automaton between the input segments has to be sufficient for
deciding H.

- No rebound automaton can supply enough information for a correct decision.

The aim is thus to define the hard set in such a way that there is a bottleneck 
that prevents each rebound automaton from transmitting a large amount of 
information. 

We first give the definition of the witness language $H \subseteq \{a, b, c, d, e, f, g, h, \$\}^*$
that will be employed in the separation of counter automata and rebound automata:

$$H = \{\, w\$(e^nf)^{(n+1)^2}u_1hu_2h\cdots hu_k \mid k, n > 0,\ w \in \{a,b,c,d\}^n,\ \forall 1 \leq i \leq k : u_i \in g^*,\ \exists 1 \leq j \leq k : |u_j| = |w|_a(n+1)^2 + |w|_b(n+1) + |w|_c \,\}$$

Intuitively this language encodes in the first portion $w$ of each string a number
$i$. The segment after \$ encodes a set of numbers in unary notation with the
property that $i \in \{|u_j| \mid 1 \leq j \leq k\}$.

Lemma 1. The language H can be accepted by a deterministic two-way one
counter automaton.

Proof. We will describe an algorithm that can be carried out by an automaton 
A as required by the lemma. 

First $A$ checks in one sweep over the input tape that its input belongs to the
regular language $\{a,b,c,d\}^*\$\{e,f\}^*\{g,h\}^*$. Then it counts the length $n$ of the
prefix from $\{a,b,c,d\}^*$ up to, but not including, \$ on its counter. Now $A$ verifies
that all blocks in $e^*f$ have the form $e^nf$ by comparing the number of e's with
the counter contents. After each comparison it restores the counter using the
block $e^nf$ just read and proceeds to the next block in $e^*f$ until all blocks of this
form have been visited. Then $A$ stores $n+1$ on its counter and moves its head to
the $(n+1)$st block $e^nf$. If there is no such block, $A$ rejects. Then, starting from an
empty counter, $A$ counts the number of e's and f's read while moving its head
to the left, thus computing $(n+1)^2$. Then it checks that the total number of f's
is $(n+1)^2$.

The technique described above for multiplication with $n + 1$ is now applied
to the initial segment $w$ of A's input. First the number of a's is counted and
$|w|_a \cdot (n + 1)$ is computed. Keeping this number on the counter the number of b's
is counted, computing $|w|_a \cdot (n + 1) + |w|_b$. After another multiplication with $n + 1$
the c's are counted and the ultimate result $|w|_a \cdot (n + 1)^2 + |w|_b \cdot (n + 1) + |w|_c$
is stored on the counter.

Now A starts to compare the counter contents with the length of the blocks
$u_i \in g^*$ one after the other. It does so by decrementing the counter while moving
its head symbol by symbol over the block. If a comparison fails, A is able to
restore the initial number stored on the counter by moving the input head back
to the starting point of the comparison. If some block agrees with the counter
contents, A accepts. Otherwise A rejects. We observe that during the operation
of A the counter never exceeds the input length. Therefore A is strong in the
sense of [12]. □
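The phases of this algorithm can be mirrored by an ordinary program. The following Python sketch is only an illustration and not a counter automaton: it uses unbounded integers and string operations, the names `encoded_number` and `in_H` are ours, and it assumes that membership in H requires $n > 0$ and some block $u_j$ whose length equals the number $|w|_a (n+1)^2 + |w|_b (n+1) + |w|_c$ computed in the proof.

```python
def encoded_number(w: str) -> int:
    """Number encoded by w, read as digits in base n + 1 with n = len(w)."""
    n = len(w)
    return w.count("a") * (n + 1) ** 2 + w.count("b") * (n + 1) + w.count("c")


def in_H(x: str) -> bool:
    """Membership test for H that follows the phases of the proof (assumed definition)."""
    if x.count("$") != 1:
        return False
    w, rest = x.split("$")
    n = len(w)
    if n == 0 or any(ch not in "abcd" for ch in w):
        return False
    # The middle segment must consist of exactly (n + 1)^2 blocks e^n f.
    middle = ("e" * n + "f") * ((n + 1) ** 2)
    if not rest.startswith(middle):
        return False
    tail = rest[len(middle):]
    if any(ch not in "gh" for ch in tail):
        return False
    # The tail u_1 h u_2 h ... h u_k lists numbers in unary; one must match.
    target = encoded_number(w)
    return any(len(block) == target for block in tail.split("h"))
```

For $w = ab$ we have $n = 2$ and the encoded number $1 \cdot 9 + 1 \cdot 3 + 0 = 12$, so an input is accepted only if one of its g-blocks has length 12.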

Lemma 2. The language H cannot be accepted by nondeterministic rebound 
automata. 

Proof. We will assume that there exists a nondeterministic rebound automaton
R that correctly decides H. Without loss of generality we may require that R, if
it accepts, does so with its head on the upper left cell of the tape. Based on the
size of R we will bound the amount of information transferred between the initial
length $n$ portion of the input ($w$ in the definition above) and the remainder of
the square input tape. This initial portion $a_1 a_2 \cdots a_n$ (including the border cells
next to it) is shown in boldface in the following picture:

Let us call the line between $w$ (plus the neighboring cells of the border) and
the rest of the tape the interface of this instance of the decision problem. Note
that the interface has length $n + 3$.

Suppose rebound automaton R has $s$ internal states. Then the set of possible
behaviors of R at the interface on a given input (without $w$'s portion) can be
fully described by a set $C$ of pairs of the form $((i, p), (j, q))$ with $1 \le i, j \le n + 3$
and $p$, $q$ states of R. This set indicates for R leaving $w$'s portion at position $i$ of
the interface and entering state $p$ whether it can possibly return to the interface
and cross it at position $j$ entering state $q$ (by the mode of acceptance R will
eventually return to $w$ if it accepts). Note that inputs having the same initial
portion $w$ and giving rise to the same set $C$ will either all be accepted, or none
of them will be in the language accepted by R.

There are at most $2^{s^2 (n+3)^2}$ different sets $C$ that can occur for automaton R,
since each component of a pair can be chosen in $s(n+3)$ ways.

We will now argue that this number is not sufficient for deciding H correctly.

For words $w \in \{a, b, c, d\}^n$ there are

$$\frac{n^3 + 6n^2 + 11n + 6}{6}$$

different numbers $|w|_a \cdot (n + 1)^2 + |w|_b \cdot (n + 1) + |w|_c$. Each subset $M$ of these
numbers with $M = \{m_1, m_2, \ldots, m_k\}$ can be encoded as $g^{m_1} h g^{m_2} h \cdots h g^{m_k}$
and used as the trailing portion of an input (together with $\$(e^n f)^{(n+1)^2}$). If two
strings for sets $M_1 \ne M_2$ admit the same set $C$ of pairs as defined above, we
choose a number $\ell \in (M_1 \setminus M_2) \cup (M_2 \setminus M_1)$. Then for a $w' \in \{a, b, c, d\}^n$ with
$\ell = |w'|_a \cdot (n + 1)^2 + |w'|_b \cdot (n + 1) + |w'|_c$ as the initial length $n$ portion of
the input, the automaton will erroneously make the same decision on the strings
encoding $M_1$ and $M_2$, showing that different sets necessarily give rise to different
sets of pairs if H is to be decided correctly by R.

There are more than $2^{n^3/6}$ subsets consisting of numbers representable by $w$'s
in the set $\{a, b, c, d\}^n$. For $n > 96 s^2$ we have $2^{n^3/6} > 2^{s^2 (n+3)^2}$ (since
$(n+3)^2 \le 16 n^2$ for $n \ge 1$) and two subsets
are assigned the same $C$, contrary to the property required by the preceding
paragraph. We conclude that a rebound automaton R accepting H cannot exist.

□ 
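The two counts used in this proof are easy to check numerically. The Python sketch below is an illustration under our reading of the bounds: it assumes that the number of behavior sets is at most $2^{s^2 (n+3)^2}$ and that the number of subsets exceeds $2^{n^3/6}$; the function names are ours.

```python
def distinct_values(n: int) -> int:
    """Count the numbers |w|_a (n+1)^2 + |w|_b (n+1) + |w|_c for w in {a,b,c,d}^n."""
    values = set()
    for a in range(n + 1):
        for b in range(n + 1 - a):
            for c in range(n + 1 - a - b):
                values.add(a * (n + 1) ** 2 + b * (n + 1) + c)
    return len(values)


def closed_form(n: int) -> int:
    """The closed form (n^3 + 6n^2 + 11n + 6) / 6, which is always an integer."""
    return (n ** 3 + 6 * n ** 2 + 11 * n + 6) // 6


def subsets_outnumber_behaviours(n: int, s: int) -> bool:
    """Compare exponents: n^3 / 6 > s^2 (n + 3)^2, written without division."""
    return n ** 3 > 6 * s * s * (n + 3) ** 2
```

For every number of states $s$ the comparison holds once $n$ exceeds $96 s^2$, which is the point where the pigeonhole argument applies.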



From the two preceding lemmas and the simulation of deterministic rebound
automata by deterministic two-way counter automata [7] we deduce the following
hierarchy result.

Theorem 1. The class of languages accepted by deterministic rebound automata
is properly contained in the class of languages accepted by deterministic two-way
counter automata.

3 Rebound Automata on Higher Dimensional Tapes 

Reasoning similar to the above can be applied to rebound automata operating on
higher dimensional tapes by adapting the size of the interface and the
corresponding constants. Thus no simulation of deterministic two-way counter
automata by multidimensional rebound automata is possible. It is then natural to
ask whether every multidimensional rebound automaton can be simulated by
a two-way counter automaton. In order to show that this is not possible, we
restrict all automata in the sequel to unary (single letter) alphabets.

On an input over a unary alphabet there are simulations between
one-dimensional $k$-head automata and $k$-dimensional rebound automata. For the
deterministic case this was shown via bounded counter automata in [7]. The same
kind of simulation works for nondeterministic machines and we have:

Lemma 3. Over unary alphabets k-dimensional rebound automata and one- 
dimensional k-head automata are equivalent in accepting power. This statement 
holds for deterministic as well as for nondeterministic automata. 

Monien has shown that there is a proper hierarchy among the classes of unary 
languages accepted by multihead automata with an increasing number of heads 
[6]. Together with the last lemma this shows that there is a unary language 
accepted by a deterministic three-dimensional rebound automaton that cannot 
be accepted by a two-head finite automaton and hence by no deterministic
two-way one counter automaton.

Corollary 1. The classes of languages accepted by deterministic two-way one
counter automata and $k$-dimensional deterministic or nondeterministic rebound
automata are incomparable for $k \ge 3$.

We remark that such a statement is not true for counter automata with at 
least two counters, since they are known to be computationally universal [5]. 

4 Closure Properties 

Closure properties of classes of languages accepted by rebound Turing machines
have been investigated by Zhang et al. [12]. They showed that for $o(\log n)$ space
the classes are not closed under concatenation or iteration (their witness
languages can already be accepted by deterministic finite rebound automata). Using
Sipser's technique for halting deterministic space bounded computations every
deterministic rebound automaton can be transformed into a halting one. This
establishes closure under complement and union.

A question left open in [12] was closure under length preserving
homomorphisms. We will show that this closure property does not hold for deterministic or
nondeterministic rebound automata. The idea is that the application of a
homomorphism destroys information that the rebound automaton needs for deciding
a witness language.

Fix some $n > 0$, $m = 2^{2n}$ and let $\equiv$ be an equivalence relation defined on
words in $\{a, b, \$\}^m$. Two words $u = u_0 u_1 \cdots u_{m-1}$ and $v = v_0 v_1 \cdots v_{m-1}$ (with
$u_i, v_i \in \{a, b, \$\}$) are equivalent ($u \equiv v$) if $u_i = v_i$ for each $0 \le i \le m - 1$ of the
form $i = \sum_{j=0}^{n-2} 2^{2j+1} + \sum_{j=0}^{n-1} c_j \cdot 2^{2j}$ with $c_j \in \{0, 1\}$ for $0 \le j \le n - 1$. Intuitively
two words are equivalent if they agree on symbols with offsets that have 1's in
the $n - 1$ least significant digits of their binary expansions that appear at odd
positions.
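To make the relation concrete, the following Python sketch lists the offsets that matter and compares two words on them. It is an illustration under an assumption: the fixed part of an offset is taken to be $\sum_{j=0}^{n-2} 2^{2j+1}$, that is, 1's at the $n - 1$ lowest odd bit positions and a 0 at the top position. The names `relevant_offsets` and `equivalent` are ours.

```python
def relevant_offsets(n: int) -> list[int]:
    """Offsets i with 1's at the n - 1 lowest odd bit positions and free even bits."""
    fixed = sum(2 ** (2 * j + 1) for j in range(n - 1))
    offsets = []
    for mask in range(2 ** n):
        # Bit j of mask plays the role of c_j and lands at even position 2j.
        variable = sum(((mask >> j) & 1) << (2 * j) for j in range(n))
        offsets.append(fixed + variable)
    return sorted(offsets)


def equivalent(u: str, v: str, n: int) -> bool:
    """u and v (both of length 2^(2n)) agree on every relevant offset."""
    m = 2 ** (2 * n)
    if len(u) != m or len(v) != m:
        raise ValueError("both words must have length 2^(2n)")
    return all(u[i] == v[i] for i in relevant_offsets(n))
```

For $n = 2$ the words have length 16 and only the offsets 2, 3, 6 and 7 are compared, so there are $2^n$ relevant positions out of $2^{2n}$.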

In the sequel we will make use of some macros that summarize a sequence
of operations of a rebound automaton. They transform one position of its input
head on the first row of an input of length $2^r$ for some $r > 0$ into another one.
Positions of the input head will encode numbers. Note that here the leftmost
input cell is assigned the value 0. Some transformations and the way they are
effected are:

Divide by 2 and determine the remainder: The head moves a single step
down and two steps left, stopping as soon as the boundary is reached. Then
the head is moved diagonally back onto the first row.

Multiply by 2: Similar to the previous operation.

Add $2^{r-1}$ if possible: Move the head down, moving right once for every two
cells visited (including the cell on the first row). If the right boundary has
been reached, the process is reversed in order to restore the initial value.
Otherwise the lower boundary will be reached, in which case the head moves
straight up onto the first row.

Check if greater than $2^{r-1}$ (or equal to $2^{r-1}$): Like the previous
operation, but always return to the initial position after recording the relation to
$2^{r-1}$.
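The first macro can be traced step by step. The Python sketch below simulates the head movement for "divide by 2" under one possible reading of the description: the left boundary is modelled as a border cell at column $-1$, and the remainder is read off from whether the border is hit after the first or the second step to the left. The function name `divide_by_two` and this border convention are our assumptions.

```python
def divide_by_two(p: int) -> tuple[int, int]:
    """Simulate the head walk from column p on the first row.

    Returns (quotient, remainder) of p divided by 2.
    """
    row, col = 0, p
    while True:
        row += 1              # one step down
        col -= 1              # first step left
        if col == -1:         # border reached after one step: p is even
            remainder = 0
            break
        col -= 1              # second step left
        if col == -1:         # border reached after two steps: p is odd
            remainder = 1
            break
    while row > 0:            # diagonal walk (up and right) back to the first row
        row -= 1
        col += 1
    return col, remainder
```

Starting from column 5, the head reaches the border in row 3 after a second step left, so the remainder is 1 and the diagonal walk ends at column 2.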

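The language we will use in the proof of the non-closure result will be
introduced next.

$$K = \{\, w\$u_1 u_2 \cdots u_{2^k - 1} \mid k \ge 1,\ \exists n \ge 1 : w \in \{a, b\}^{2^{2n} - 1},\ \forall\, 1 \le i \le 2^k - 1 : u_i \in \{a, b\}^{2^{2n}},\ \exists\, 1 \le i \le 2^k - 1 : w\$ \equiv u_i \,\}$$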

Lemma 4. The language K is the image under a length-preserving homomor- 
phism of a language that can be accepted by a deterministic two-dimensional 
rebound automaton. 

Proof. We will describe the computation of a rebound automaton R satisfying
the conditions of the lemma. The language accepted by this automaton is defined
like K, with the exception that exactly one of the $u_i$'s contains marked symbols
$a'$, $b'$ at the positions that are relevant for equivalence.

First R checks that there is exactly one \$ in the input string. By repeatedly
dividing the head position by 2 and counting the number of iterations modulo
2 it then verifies that the length of the prefix in $\{a, b\}^*\$$ is of the form $2^{2n}$ for
some $n \ge 1$. Next R checks that the length of the entire input is a power of 2
and that at least one symbol follows the \$.

For the following operations we divide the binary representation of each head
position on the first row into two parts. The upper $k$ bits define a block while
the lower $2n$ bits specify a symbol within the block. Bits on an odd position of
this latter part (except for the most significant one) will have a value of 1 except
for intermediate results.

The binary representation of a number stored as R's head position will look
like this:

$$b_{k-1} b_{k-2} \cdots b_0\, 0\, s_{n-1}\, 1\, s_{n-2} \cdots 1\, s_0$$

with $b_0, \ldots, b_{k-1}, s_0, \ldots, s_{n-1} \in \{0, 1\}$. The number $b_{k-1} b_{k-2} \cdots b_0$ specifies a
block while $s_{n-1} s_{n-2} \cdots s_0$ designates a symbol.
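The layout of a head position can be written down as a pair of conversion functions. This Python sketch is only an illustration of the bit format described above; the names `pack` and `unpack` are ours, and `s_bits[j]` stands for the bit $s_j$.

```python
def pack(block: int, s_bits: list[int]) -> int:
    """Head position with layout b_{k-1}..b_0 0 s_{n-1} 1 s_{n-2} ... 1 s_0."""
    n = len(s_bits)
    symbol = 0
    for j in range(n):
        symbol |= s_bits[j] << (2 * j)      # s_j sits at even position 2j
    for j in range(n - 1):
        symbol |= 1 << (2 * j + 1)          # odd positions below the top one are 1
    return (block << (2 * n)) | symbol       # the block address occupies the upper bits


def unpack(position: int, n: int) -> tuple[int, list[int]]:
    """Recover the block address and the bits s_0, ..., s_{n-1}."""
    block = position >> (2 * n)
    s_bits = [(position >> (2 * j)) & 1 for j in range(n)]
    return block, s_bits
```

With $n = 3$, block 0 and all $s_j = 1$ the position is `0b011111`, the pattern of a zero followed by ones in the lower $2n$ bits.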

We will first argue that R is able to increment the block address and switch
from one symbol address relevant for equivalence to the next one without
interference between the two parts. For the block address R divides the current
position by 4 and records the remainder (i.e. essentially $s_0$) in its finite control.
Then it starts to repeatedly divide its head position by 2, adding $2^{k+2n-1}$ if and
only if the remainder was 1. Note that this shifts the $s_i$'s to the most significant
digits. This process stops if a remainder 0 is encountered in an odd bit position
indicating that all $s_i$'s have been shifted. Now R moves the head one position to
the right, thus incrementing the block address. Then R repeatedly multiplies the
head position by 2 moving bits from the upper positions to the least significant
one. It can determine the most significant bit of the stored number by
comparing it with $2^{k+2n-1}$. This process stops when after an even, non-zero number of
multiplications the stored number is below $2^{k+2n-1}$. If this is the case R checks
whether it is also bounded by $2^{k+2n-2}$, in which case no overflow has occurred.
An overflow is recorded in the finite control and the block address is reset to 0.
Then the remainder initially stored in the finite control is shifted into the stored
number, completing the cycle.

The manipulation of the symbol address proceeds similarly. Now the sequence
of even bit positions $s_j$ is treated as a binary number and is updated accordingly.
Carries can be kept in the finite control.

For checking its input in a first phase R moves its head from block to block
and visits every symbol relevant for equivalence. The initial value stored on the
head is

$$\underbrace{00 \ldots 00}_{k \text{ bits}}\ \underbrace{0111 \ldots 11}_{2n \text{ bits}},$$

which can be generated easily from the length of $w\$$. Note that the format of the
symbol address excludes \$ from the set of relevant symbols. If a block contains
a symbol $a'$ or $b'$, then all relevant symbols of this block have to be in $\{a', b'\}$
and exactly one block should have this property.

In a second phase R keeps symbol addresses fixed and cycles through all
blocks. It records a symbol in the first block $w\$$ and compares it to the primed
symbol found at the same symbol position of some block. This is repeated for
each symbol position. If all symbols match (where $a$ matches $a'$ and $b$ matches
$b'$) and all other tests are successful, R accepts its input.

The length-preserving homomorphism $h$ mapping the language accepted by
R to K is defined by $h(a) = h(a') = a$, $h(b) = h(b') = b$, and $h(\$) = \$$. □

Lemma 5. Language K cannot be accepted by any (even higher dimensional) 
nondeterministic rebound automaton. 

Proof (sketch). Observe that there are $2^{2^n}$ equivalence classes of words of length
$m = 2^{2n}$ and $2^{2^{2^n}} - 1$ nonempty subsets of these classes, each of which can be
specified by some input string. The number of behaviors of a rebound automaton
at the interface between $w\$$ with $w \in \{a, b\}^{m-1}$ and the remaining input is
bounded by $2^{p(m)}$ for some polynomial $p$, which is smaller than the number of
subsets for large $n$ because $2^{2^n} = 2^{\sqrt{m}}$. In a similar way as in the proof of
Lemma 2 it follows that no rebound automaton can decide K correctly. □

Theorem 2. The classes of languages accepted by deterministic or nondeter- 
ministic rebound automata are not closed under length preserving homomor- 
phisms. 

Proof. Combine the two preceding lemmas. □ 

5 Conclusion and Open Problems 

We have solved the open problem from [11, 7, 10, 12] about the relationship
between rebound automata and deterministic two-way one counter automata.
Further we established non-closure under length-preserving homomorphisms of
languages accepted by rebound automata. Some problems that remain open are:

- Is it possible to use a more “typical” example of a one counter language
for separating deterministic rebound automata and deterministic counter
automata? A language that captures the concept of counting more directly
than our H is $L = \{w \mid w \in \{0, 1\}^*, |w|_0 = |w|_1\}$. Note that L can even be
accepted by a one-way one counter automaton. Sugata, Umeo, and Morita
conjecture that deterministic rebound automata cannot decide the language
L [11].

- Is it possible to separate counter automata and rebound automata with
bounded languages, i.e. subsets of $w_1^* w_2^* \cdots w_k^*$ for fixed words $w_1, w_2, \ldots, w_k$?
A bounded language that might be too difficult to be decided by a rebound
automaton is {a"6" j n > 0}. This Icmguage can be accepted by a determin- 
istic two-way one counter automaton [9], but the algorithm requires access 
to the input while the counter is non-empty. 

- Can every nondeterministic rebound automaton be simulated by a
nondeterministic two-way one counter automaton? Note that the simulation from [7]
makes essential use of the simulated automaton being deterministic. If such
a simulation exists, the separations in [10] as well as the one in the present
work show a strict hierarchy of the nondeterministic models.

- What is the relationship between determinism and nondeterminism for
rebound automata? The argument in [1] that separates deterministic and
nondeterministic two-dimensional automata relies on the presence of non-blank
cells in parts of the input other than the first row.

Abstract. Free Binary Decision Diagrams (FBDDs) are a data structure
for the representation and manipulation of Boolean functions. Efficient
algorithms for most of the important operations are known if only
FBDDs respecting a fixed graph ordering are considered. However, the
size of such an FBDD may strongly depend on the chosen graph ordering
and efficient algorithms for computing good or optimal graph orderings
are not known. In this paper it is shown that the existence of polynomial
time approximation schemes for optimizing graph orderings or for
minimizing FBDDs implies NP = P, and so such algorithms are quite
unlikely to exist.



1 Introduction 

Many variants of Binary Decision Diagrams (BDDs) have been investigated as
a data structure for Boolean functions. Such data structures have several
applications, in particular in computer aided hardware design. They are used in
programs for, e.g., circuit verification, test pattern generation, model checking
and logic synthesis. Data structures for Boolean functions should allow the
efficient representation and manipulation of important functions. The most popular
data structure proposed for this purpose are Ordered Binary Decision Diagrams
(OBDDs), which were introduced by Bryant [5]. Many generalizations of OBDDs
have been considered since there are many important functions for which OBDDs
are too large to be stored in a computer memory. In this paper we focus on a
particular extension of OBDDs, namely Free BDDs (FBDDs).

FBDDs have also been considered in complexity theory under the name
read-once branching programs. There are many papers presenting lower bound
methods for FBDDs. The first ones are due to Wegener [16] and Žák [17], and in
the paper of Simon and Szegedy [14] most previous approaches are handled in a
unified way. Already in the early paper of Fortune, Hopcroft and Schmidt [6] it
was shown that FBDDs are exponentially more powerful than OBDDs by
presenting an example of a function with polynomial FBDD size but exponential
OBDD size. The algorithmic aspects of FBDDs are investigated by Sieling and
Wegener [13] and Gergov and Meinel [8]. It turned out that many but not all
operations on Boolean functions which can be performed efficiently on functions
represented by OBDDs can also be performed efficiently on functions represented
by FBDDs if only FBDDs according to a fixed graph ordering are considered.
This is similar to OBDDs where many operations can be performed efficiently
only if the considered OBDDs have the same variable ordering. Graph orderings
are a generalization of variable orderings. A graph ordering G defines for each
input the ordering in which the variables have to be tested in FBDDs respecting
G. Unlike OBDDs, FBDDs allow different orderings for different inputs. FBDDs
respecting a graph ordering G are called G-FBDDs or G driven FBDDs.

We remark that (similar to OBDDs) the size of a G-FBDD for a particular
function may strongly depend on the chosen graph ordering G. So it is an
important problem to choose a good graph ordering. A heuristic for computing
graph orderings of a tree-like shape has been proposed by Bern, Meinel and
Slobodová [2]. An algorithm with a double exponential worst-case run time for
minimizing FBDDs was presented by Günther and Drechsler [9]. This algorithm
can also be used to compute optimal graph orderings, since for each FBDD H
a graph ordering G can easily be computed so that H is a G-FBDD, see Sieling
and Wegener [13]. However, the question whether there are efficient algorithms
for computing good or optimal graph orderings remains open. In this paper we
consider the following two closely related optimization problems.

MinFBDD

Instance: A Boolean function $f$ described by an FBDD G.

Problem: Compute an FBDD for $f$ which has minimal size.

OptGraphOrdering

Instance: A Boolean function $f$ described by an FBDD G.

Problem: Compute a graph ordering $G^*$ so that the size of a $G^*$-FBDD for $f$
is minimal among all FBDDs for $f$.

In Sections 3 and 4 we shall prove the following hardness results. 

Theorem 1. If there is a polynomial time approximation scheme for MinFBDD, 
then NP = P. 

Theorem 2. If there is a polynomial time approximation scheme for 
OptGraphOrdering, then NP = P. 

Hence, it is unlikely that the two considered problems have polynomial time
approximation schemes, and the search for such schemes can justifiably be
given up. It remains open whether there are
polynomial time approximation algorithms with some larger performance ratio
for these problems.

We remark that for OBDDs there are similar optimization problems, namely
MinOBDD, i.e., the problem to compute a minimal size OBDD for a function
given by an OBDD, and OptVarOrdering, i.e., the problem to compute an
optimal variable ordering for a function given by an OBDD. However, these problems
are polynomially related and, therefore, usually not explicitly distinguished. It is
not clear whether MinFBDD and OptGraphOrdering are polynomially related,
since it is not known whether a polynomial time algorithm for OptGraphOrdering
implies a polynomial time algorithm for MinFBDD. The NP-hardness of
MinOBDD and OptVarOrdering for OBDDs for multi-output functions was
shown by Tani, Hamaguchi and Yajima [15] and for OBDDs for single-output
functions by Bollig and Wegener [4]. In Sieling [11] it is shown that the existence
of polynomial time approximation schemes for these problems implies P = NP
and in Sieling [12] that it is NP-hard to approximate these problems up to any
constant factor.



2 Preliminaries 

A Binary Decision Diagram (BDD) for the representation of functions $f_1, \ldots, f_m$
over the variables $x_1, \ldots, x_n$ is a directed acyclic graph. The graph consists of
terminal nodes, which have no successor and which are labeled by 0 or 1, and
internal nodes. Each internal node is labeled by a variable and has an outgoing
0-edge and an outgoing 1-edge. In free BDDs (FBDDs) on each directed path
each variable occurs at most once as the label of a node. Examples of FBDDs are
shown on the right side of Figure 1. In the figure edges are directed downwards.
We draw 0-edges as dashed lines and 1-edges as solid lines. Internal nodes are
drawn as circles and terminal nodes as squares.

Each node $v$ of an FBDD represents a Boolean function $f_v$. In order to
evaluate this function for an input $a = (a_1, \ldots, a_n)$ we start at $v$. At each $x_i$-node
we follow the outgoing $a_i$-edge. Finally, a terminal node is reached, and $f_v(a)$ is
equal to the label of this terminal node. In an FBDD for the representation of
$f_1, \ldots, f_m$ for each function $f_j$ there is a pointer to a node representing $f_j$.
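The evaluation rule and the read-once condition translate directly into code. The following Python sketch is a minimal illustration; the class names `Terminal` and `Node`, the 0-based variable indices and the naive path-based check `is_free` (which may take exponential time on shared subgraphs) are our choices.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Terminal:
    value: int                       # label 0 or 1


@dataclass(frozen=True)
class Node:
    var: int                         # index i of the tested variable (0-based here)
    low: Union["Node", Terminal]     # successor reached via the 0-edge
    high: Union["Node", Terminal]    # successor reached via the 1-edge


def evaluate(v, a):
    """Follow the a_i-edge at each x_i-node until a terminal node is reached."""
    while isinstance(v, Node):
        v = v.high if a[v.var] else v.low
    return v.value


def is_free(v, seen=frozenset()):
    """Check that no variable is tested twice on any directed path from v."""
    if isinstance(v, Terminal):
        return True
    if v.var in seen:
        return False
    seen = seen | {v.var}
    return is_free(v.low, seen) and is_free(v.high, seen)
```

For example, `Node(0, Terminal(0), Node(1, Terminal(0), Terminal(1)))` represents the conjunction of the first two variables and passes the read-once check.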

A graph ordering describes for each input a permutation of the variables.
Formally, a graph ordering G is a directed acyclic graph with one source node
and one terminal node. Each internal node is labeled by a Boolean variable and
has an outgoing 0-edge and an outgoing 1-edge. Furthermore, on each path from
the source to the terminal node each variable is tested exactly once. Similar
to FBDDs each input $a = (a_1, \ldots, a_n)$ defines a path from the source to the
terminal node of the graph ordering. For a graph ordering G we call an FBDD G'
a G-FBDD or G driven FBDD if for each input the variables on the computation
path in G' are found in the same ordering as on the computation path in G, where
on the computation path in G' variables may be omitted.

We repeat some properties of FBDDs. The usual reduction rules for OBDDs
can also be applied to FBDDs. By the deletion rule a node $v$ whose successors
coincide can be deleted after redirecting the edges leading to $v$ to its successor.
By the merging rule nodes $v$ and $w$ with the same label, the same 0-successor and
the same 1-successor can be merged, i.e., the edges leading to $v$ are redirected to
$w$, and $v$ is deleted. An FBDD is called reduced if neither of the reduction rules
is applicable. Sieling and Wegener [13] prove that reduced G-FBDDs are unique
up to isomorphism. Hence, we may talk about the (reduced) G-FBDD for some
function $f$.

If two FBDDs $G_1$ and $G_2$ do not respect the same graph ordering, no
deterministic polynomial time algorithm is known for the equivalence test, i.e., the
test whether $G_1$ and $G_2$ represent the same function. A probabilistic polynomial
time equivalence test with one-sided error was presented by Blum, Chandra and
Wegman [3]. This algorithm always classifies equivalent FBDDs as equivalent
but with a probability of at most 1/2 it may also classify nonequivalent FBDDs
as equivalent. Hence, the equivalence test for FBDDs is contained in coRP.
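A common way to realize such a test is to compare the multilinear polynomials of the two FBDDs at random points of a finite field. The Python sketch below follows that idea and is not taken from [3]: the tuple encoding of nodes, the prime `P` and the number of trials are our assumptions. Terminals are the integers 0 and 1, and an internal node is a tuple `(var, low, high)`.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1, used as the size of the field


def poly_value(node, r, cache=None):
    """Value at the point r of the multilinear polynomial represented by node."""
    if cache is None:
        cache = {}
    if isinstance(node, int):            # terminal node labeled 0 or 1
        return node
    key = id(node)                       # shared subgraphs are evaluated once
    if key not in cache:
        var, low, high = node
        lo = poly_value(low, r, cache)
        hi = poly_value(high, r, cache)
        # Read-once property: x_var occurs in neither lo nor hi, so the
        # result stays multilinear.
        cache[key] = ((1 - r[var]) * lo + r[var] * hi) % P
    return cache[key]


def probably_equivalent(g1, g2, num_vars, trials=20, rng=random):
    """One-sided test: False is always correct, True may rarely be wrong."""
    for _ in range(trials):
        r = [rng.randrange(P) for _ in range(num_vars)]
        if poly_value(g1, r) != poly_value(g2, r):
            return False
    return True
```

Equivalent FBDDs yield the same polynomial and are therefore never separated, while two different multilinear polynomials in $n$ variables agree at a random point with probability at most $n/P$.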

We call a node $v$ of an FBDD redundant, if at both successors of $v$ the
same function is represented. Then $v$ can be deleted by redirecting all incoming
edges to one of the successors of $v$. However, different from OBDDs the most
efficient known algorithm to detect redundant nodes is to apply the probabilistic
equivalence test to the successors of $v$.

For the definitions of notions concerning approximation algorithms we follow
Garey and Johnson [7]. Let $\Pi$ be some minimization problem, let $D_\Pi$ be the
set of instances of $\Pi$ and let A be some algorithm computing legal solutions of
$\Pi$. For $I \in D_\Pi$ let $A(I)$ be the value of the output of A on instance $I$ and let
$OPT(I)$ be the value of an optimal solution for $I$. The performance ratio of A is
defined as $\sup_{I \in D_\Pi} \{A(I)/OPT(I)\}$. A polynomial time approximation scheme
A is a polynomial time algorithm that gets besides $I \in D_\Pi$ an extra input $\varepsilon > 0$.
For each $\varepsilon > 0$ it has to achieve a performance ratio of at most $1 + \varepsilon$.

3 The Complexity of FBDD Minimization 

We prove the nonapproximability results by a reduction from a variant of
satisfiability which we call ε-robust 3-SAT-b (εRob3SAT-b). This problem is a promise
problem. Recall that an algorithm for a promise problem has to be
successful only on instances fulfilling the promise. It does not have to detect that
the promise is not fulfilled and in this case it may behave arbitrarily.

εRob3SAT-b

Instance: A set U of variables and a set C of clauses fulfilling the following
properties:

1. Each clause consists of at least two and at most three literals and
each variable occurs in each clause at most once.

2. Each variable occurs at least once and at most $b$ times.

3. Any two clauses share at most one literal.

Promise: If the set of clauses is not satisfiable, for each assignment to the
variables at least $\varepsilon |C|$ clauses are not satisfied.

Problem: Is there a satisfying assignment to the variables?

The promise ensures a gap between satisfiable and nonsatisfiable inputs,
which helps to prove the nonapproximability result for MinFBDD. The
restrictions on the input make the reduction of εRob3SAT-b to MinFBDD easier. We
do not know whether a hardness result for εRob3SAT-b was explicitly stated in
the literature. The following hardness result follows easily by reexamining the
proofs of Arora, Lund, Motwani, Sudan and Szegedy [1] and Papadimitriou and
Yannakakis [10].

Theorem 3. There are constants $b \in \mathbb{N}$ and $\varepsilon > 0$ so that εRob3SAT-b is
NP-hard.

Proof of Theorem 1. We assume that there is a polynomial time
approximation scheme A for MinFBDD and construct a polynomial time algorithm for
εRob3SAT-b where $b$ and $\varepsilon$ are the constants ensured by Theorem 3. Let
$(U = \{u_1, \ldots, u_n\}, C = \{C_1, \ldots, C_m\})$ be an instance for εRob3SAT-b fulfilling the
promise. We are going to present a polynomial time algorithm for the
transformation of (U, C) into an instance H for MinFBDD. On H we apply the
approximation scheme A, and from the size of the result we can decide whether
(U, C) is satisfiable. Since on instances not fulfilling the promise algorithms for
εRob3SAT-b may behave arbitrarily, we do not have to consider this case.

We construct an FBDD for a function F which is defined over the set
$V = \{x_1, \ldots, x_n, x'_1, \ldots, x'_n\} \cup \{y\} \cup \{z_i^j \mid i \in \{1, 2, 3\}, j \in \{1, \ldots, n+m\}\}$ of variables.
The function F is composed of the functions $f_1, \ldots, f_{n+m}$ defined by

$$f_i = x_i \wedge x'_i \quad \text{for } 1 \le i \le n,$$

$$f_{n+i} = \bigwedge_{j:\, u_j \in C_i} x_j \wedge \bigwedge_{j:\, \bar{u}_j \in C_i} x'_j \quad \text{for } 1 \le i \le m.$$

Intuitively, the variable $x_i$ corresponds to the variable $u_i$ of the instance (U, C)
and $x'_i$ corresponds to the negation of $u_i$. For each variable $u_i$ the function $f_i$ is
introduced, and for each clause $C_i$ the function $f_{n+i}$ is introduced where $f_{n+i}$
computes the conjunction of the variables corresponding to the literals in $C_i$. In
the following we use $z^j$ as an abbreviation of $z_1^j \wedge z_2^j \wedge z_3^j$. Then we define

$$F = \begin{cases} f_i & \text{if } z^1 = \cdots = z^{i-1} = 1 \text{ and } z^i = 0, \\ y & \text{if } z^1 = \cdots = z^{n+m} = 1. \end{cases}$$
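The definition of F can be read as a small program: the $z$-variables act as a switch that selects one of the functions $f_i$, or $y$ if every switch is on. The Python sketch below is an illustration under our own conventions: a clause is a list of signed integers ($+j$ for $u_j$, $-j$ for its negation, as in the DIMACS format), `x` and `xp` hold the values of $x_1, \ldots, x_n$ and $x'_1, \ldots, x'_n$, and `z` is a list of $n + m$ triples.

```python
def f_values(n, clauses, x, xp):
    """Values of f_1, ..., f_{n+m} for the given bits of x and x'."""
    fs = [x[i] & xp[i] for i in range(n)]            # f_i = x_i AND x'_i
    for clause in clauses:
        value = 1
        for lit in clause:
            # A positive literal u_j is represented by x_j, a negated one by x'_j.
            value &= x[lit - 1] if lit > 0 else xp[-lit - 1]
        fs.append(value)
    return fs


def F(n, clauses, x, xp, y, z):
    """Return f_i for the first i with z^i = 0, or y if all z^i are 1."""
    fs = f_values(n, clauses, x, xp)
    for i, triple in enumerate(z):
        if not all(triple):          # z^i abbreviates the conjunction of the triple
            return fs[i]
    return y
```

With $n = 2$ and the single clause $\{u_1, \bar{u}_2\}$, setting the third triple to $(1, 0, 1)$ and the first two to all ones selects $f_3 = x_1 \wedge x'_2$.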

An FBDD for F is shown on the left side of Figure 1. We see that the FBDD
consists of a switch that chooses which of the functions $f_i$ has to be evaluated.
Figure 1 shows that it is easy to construct FBDDs for $f_1, \ldots, f_{n+m}$ in polynomial
time and to combine these FBDDs to an FBDD for F in polynomial time.

Let $L$ denote the total number of literals in the clauses in C. Let $\delta = \varepsilon/(21b)$.
We apply the polynomial time approximation scheme A for MinFBDD on the
constructed FBDD for F for the performance ratio $1 + \delta$ and obtain an FBDD
G. By the following claim it suffices to compare the number of internal nodes of
G with $(1 + \delta)(5n + 2m + L + 1)$ in order to decide whether (U, C) is satisfiable.
Hence, Theorem 1 follows from the claim and Theorem 3.

Claim. (U, C) is satisfiable iff G consists of at most $(1 + \delta)(5n + 2m + L + 1)$
internal nodes.

In order to prove the claim we first assume that (U, C) is satisfiable. Let $\sigma$
be a satisfying assignment. We show that the size of a minimal FBDD for F is
bounded by $5n + 2m + L + 1$ by presenting an FBDD of this size. Since we chose
the performance ratio $1 + \delta$ for A, it follows that the size of the result of A is
bounded by $(1 + \delta)(5n + 2m + L + 1)$.

Fig. 1. The shape of an FBDD for the function F and FBDDs for $f_i = x_i \wedge x'_i$ and for
$f_{n+j} = x_1 \wedge x'_2 \wedge x_3$, i.e., the function corresponding to the clause $C_j = \{u_1, \bar{u}_2, u_3\}$


It is easy to construct an FBDD for $f_1, \ldots, f_n$ with $2n$ internal nodes in such
a way that in the representation of $f_i$ the variable $x_i$ is arranged before $x'_i$ if $u_i$
has the value 0 in $\sigma$, and $x'_i$ is arranged before $x_i$ otherwise. For each function
$f_{n+i}$ we construct an FBDD consisting of $|C_i|$ internal nodes, where the last
variable is a variable corresponding to a literal of $C_i$ that is satisfied by $\sigma$. If we
join the constructed FBDDs for $f_1, \ldots, f_{n+m}$, the last internal node of $f_{n+i}$ can
be merged with a node of one of the FBDDs for $f_1, \ldots, f_n$. Hence, the FBDD
for all these functions consists of $2n + \sum_{i=1}^{m} (|C_i| - 1) = 2n + L - m$ internal
nodes. Finally, we construct from this FBDD for $f_1, \ldots, f_{n+m}$ an FBDD for F as
outlined in Figure 1. Then the number of nodes labeled by $y$ and the $z$-variables
is $3n + 3m + 1$ and the total number of internal nodes is $5n + 2m + L + 1$. This
implies the only-if part of the claim.
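The node count of this construction can be recomputed mechanically. The Python sketch below only adds up the contributions named in the text; the function name `satisfiable_upper_bound` and the representation of a clause as a list of literals are our choices.

```python
def satisfiable_upper_bound(n, clauses):
    """Internal nodes of the FBDD built from a satisfying assignment."""
    m = len(clauses)
    # 2n nodes for f_1..f_n, and |C_i| - 1 further nodes per clause because
    # the last node of each clause function is merged with an existing node.
    function_nodes = 2 * n + sum(len(clause) - 1 for clause in clauses)
    # One y-node and three z-nodes for each of the n + m switch positions.
    switch_nodes = 3 * (n + m) + 1
    return function_nodes + switch_nodes
```

For $n = 3$ and two clauses with three and two literals we get $m = 2$, $L = 5$ and $9 + 16 = 25 = 5n + 2m + L + 1$ internal nodes.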

In order to prove the other implication of the claim we assume that (U, C) is
not satisfiable. By the promise for each assignment to the variables in U at least
$\varepsilon m$ clauses are not satisfied. We show that a minimal FBDD for F consists of
more than $(1 + \delta)(5n + 2m + L + 1)$ internal nodes. Then also the output of A
has to consist of more than this number of nodes, which implies the claim.

We start with an arbitrary FBDD G for F. If this FBDD does not have the
shape of the FBDD in Figure 1, i.e., if the $z$-variables are not arranged in the top
of the FBDD, we shall rearrange this FBDD without changing the represented
function and without increasing the size so that afterwards the $z$-variables are
arranged as shown in Figure 1. Then the number of nodes labeled by $y$ or a
$z$-variable is minimal, since F essentially depends on each of these variables
and the FBDD only contains one node testing each of these variables. Finally,
we compute the minimal size of an FBDD representing $f_1, \ldots, f_{n+m}$ under the
assumption that the instance (U, C) is not satisfiable. Altogether, we obtain a
lower bound on the size of an FBDD for F.

Since our goal is to prove a lower bound on the FBDD size, it is not necessary 
to present a polynomial time algorithm for the rearrangement. In fact, we always 
assume that the considered FBDD does not contain redundant nodes. We already 
remarked in Section 2 that redundant nodes can be deleted by redirecting the 
incoming edges to one of the successors, but that no deterministic polynomial 
time algorithm for the detection of redundant nodes is known. 

In Figure 1 three $z^1$-nodes are surrounded by a dotted line. We call this
arrangement of $z^1$-nodes a $z^1$-block. In the same way we define $z^i$-blocks. The
first step of the rearrangement is to make sure that in the FBDD the tests of
$z^1$-variables are always arranged as $z^1$-blocks. Let $a_1$, $a_2$ and $a_3$ be the numbers of
$z^1_1$-, $z^1_2$- and $z^1_3$-nodes, respectively. W.l.o.g. let $a_1$ be the minimum of these three
numbers. Then we replace in $G$ the variables $z^1_2$ and $z^1_3$ by the constant 1, i.e., we
redirect each edge leading to a node labeled by $z^1_2$ or $z^1_3$ to the 1-successor of this
node. The resulting FBDD represents the function $F_{|z^1_2 = 1,\, z^1_3 = 1}$. Afterwards, we
replace each $z^1_1$-node $v$ by a $z^1$-block, i.e., we create a $z^1$-block and redirect all
edges leading to $v$ to this $z^1$-block. The 0-edges leaving the nodes of the $z^1$-block
are directed to the 0-successor of $v$ and the 1-edge leaving the last node of the
$z^1$-block is directed to the 1-successor of $v$. It is easy to verify that we again
obtain an FBDD for $F$ and that the size does not increase: each of the $a_1$ replaced
nodes becomes three nodes, and $3a_1 \le a_1 + a_2 + a_3$. In the same way we
may ensure that the tests of the $z^2$-variables are arranged as $z^2$-blocks and so
on. We call the resulting FBDD again $G$.

The next step is to ensure that the $z^i$-blocks are arranged in the top of the
FBDD as shown in Figure 1. In order to show that it is possible to rearrange
the FBDD in such a way without increasing the size we define the property $P(j)$
of $G$. The FBDD $G$ always has the property $P(0)$. For $j \in \{1, \ldots, n + m\}$ the
FBDD $G$ has the property $P(j)$ if the $z^1$-block, $\ldots$, $z^j$-block in $G$ are arranged
as shown in Figure 1, i.e., at the source there is a $z^1$-block, the 1-successor of
the last node of this block is a $z^2$-block and so on up to the $z^j$-block.

Lemma 1. Let $j \in \{1, \ldots, n + m\}$. If $G$ has the property $P(j - 1)$ and does not
have the property $P(j)$, it contains at least two $z^j$-blocks.

We omit the technical proof of this lemma. Now we can rearrange $G$ in the
following way. We search for the smallest $j$ so that $G$ does not have the property
$P(j)$. Then the $z^1$-, $\ldots$, $z^{j-1}$-blocks are arranged as shown on the left side
of Figure 2. (In order to simplify the figure, $z^i$-blocks are drawn as ordinary
internal nodes of an FBDD.) Since we may assume that $G$ does not contain
redundant nodes, the part of $G$ that computes the functions $f_1, \ldots, f_{j-1}$ does
not contain tests of $z$-variables. Let $v$ be the node which is the 1-successor of
the last node of the $z^{j-1}$-block of $G$ (if $j = 1$, let $v$ be the source of $G$). Let
$G^*$ be the FBDD starting at $v$. In $G^*$ we replace the variables $z^j_1$, $z^j_2$ and $z^j_3$ by
the constant 1, i.e., we redirect the edges leading to a node labeled by one of
these variables to the 1-successor of this node. This does not affect the part of
the FBDD computing $f_1, \ldots, f_{j-1}$ as remarked above. By Lemma 1 at least 6
internal nodes are deleted by the replacement. Then we create a $z^j$-block and
redirect the edge leading to $v$ to this $z^j$-block. As 0-successor of the $z^j$-block we
create an FBDD computing $f_j$, which consists of at most 3 internal nodes. The
1-successor is $G^*$ after the replacement of $z^j_1$, $z^j_2$ and $z^j_3$ by 1. The resulting BDD
is shown in the right of Figure 2. It is easy to see that the constructed BDD is
an FBDD for $F$. The number of internal nodes does not increase since at most
6 internal nodes are inserted and at least 6 internal nodes are deleted by the
replacement. This is the reason why for the selection of each of the functions $f_i$
three instead of only one $z$-variable is used. It is easy to see that the FBDD has
the properties $P(1), \ldots, P(j)$. Hence, we may iterate this procedure until the
FBDD has the shape shown in Figure 1. It consists of $3(n + m) + 1$ nodes labeled
by $y$ and the $z$-variables, which is optimal, and an FBDD for $f_1, \ldots, f_{n+m}$. In
the following we estimate the size of this FBDD.

Fig. 2. The construction of an FBDD with property $P(j)$ from an FBDD with property
$P(j - 1)$

Since the FBDD does not contain redundant nodes, the representation of
$f_1, \ldots, f_n$ consists of exactly $2n$ internal nodes. We interpret the relative
ordering of $x_i$ and $x_i'$ as an assignment to the variable $u_i$ where $u_i = 0$ iff in the
representation of $f_i$ the variable $x_i$ is arranged before $x_i'$. For the representation
of $f_{n+i}$ there are $|C_i|$ internal nodes if we ignore mergings, because the FBDD
does not contain redundant nodes. Since each two clauses share at most one
literal, at most one node of the representation of $f_{n+i}$, namely the node of the
last level, can be merged with some other node, namely with a node of the
representation for $f_1, \ldots, f_n$ or with a node of the representation of $f_{n+j}$. In the
former case $C_i$ is satisfied by the assignment defined above. If $C_i$ is not satisfied
by this assignment, a node of the representation of $f_{n+i}$ may be merged with a
node of the representation of $f_{n+j}$ but not with a node of the representation of
$f_1, \ldots, f_n$. Since two clauses share at most one literal, the representation of $f_{n+i}$
contains at least $|C_i| - 1$ nodes that are not merged with any other node. Since
each variable occurs in at most $b$ clauses, at most $b$ representations of functions
$f_{n+i}$ that correspond to unsatisfied clauses can share a node. Hence, besides the
$\sum_{i=1}^{m} (|C_i| - 1)$ nodes that cannot be merged with other nodes, for each $b$
unsatisfied clauses there is at least one node. By the promise at least $\varepsilon m$ clauses are not
satisfied. Hence, the representation of the functions $f_1, \ldots, f_{n+m}$ consists of at
least $2n + \sum_{i=1}^{m} (|C_i| - 1) + \varepsilon m / b$ nodes. Together with the $3(n + m) + 1$ nodes
labeled by $y$ and the $z$-variables, the FBDD contains at least $5n + 2m + L + 1 + \varepsilon m / b$
nodes. By the choice of $\delta$ and because of the inequalities $L \le 3m$, $n \le 3m$ and
(w.l.o.g.) $m \ge 1$, we have $5n + 2m + L + 1 + \varepsilon m / b > (1 + \delta)(5n + 2m + L + 1)$.
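The last inequality can be checked with a short calculation. The concrete choice of $\delta$ is made earlier in the proof; the sketch below only assumes $\delta < \varepsilon/(21b)$, which is one sufficient condition and not necessarily the exact choice used there.

```latex
\begin{align*}
5n + 2m + L + 1 &\le 5(3m) + 2m + 3m + m = 21m
  && \text{(using the bounds on } n \text{ and } L \text{ and } 1 \le m\text{)} \\
\delta\,(5n + 2m + L + 1) &\le 21\,\delta\,m < \frac{\varepsilon m}{b}
  && \text{(assumption: } \delta < \varepsilon/(21b)\text{)} \\
5n + 2m + L + 1 + \frac{\varepsilon m}{b} &> (1 + \delta)(5n + 2m + L + 1).
\end{align*}
```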

This completes the proof of the claim and of Theorem 1. 

4 The Complexity of Optimizing Graph Orderings 

In order to prove Theorem 2 we assume that there is a polynomial time
approximation scheme $B$ for OptGraphOrdering. We try to adapt the proof of
Theorem 1. This means we construct for the instance $(U, C)$ of $\varepsilon$Rob3SAT-$b$ an
FBDD for the function $F$ as described in the last section and apply $B$ to this
FBDD. The problem is now that the result of $B$ is a graph ordering $H$
instead of an FBDD, while by the results of Section 3 we only know of a relation
between the size of an $H$-FBDD for $F$ and the satisfiability of $(U, C)$. Hence, we
would like to compute an $H$-FBDD for $F$ in order to determine its size. However,
for the computation of an $H$-FBDD from a graph ordering $H$ and a function
given by an FBDD no polynomial time algorithm is known. Hence, we use a
different approach which only works in our special situation.

Assume for a moment that it is possible to construct an $H$-FBDD $G$ for the
function $F$. Then we may apply transformation steps similar to those described
in the proof of Theorem 1 to $G$ in order to obtain an equivalent FBDD $G'$
with the shape shown in Figure 1. By these steps the size of the considered
FBDD does not increase. Hence, we may count the number of nodes of $G'$ in
order to decide whether $(U, C)$ is satisfiable. In the following we show that we
can compute $G'$ from $H$ in polynomial time without computing $G$. Hence, it
is possible to determine in polynomial time whether $(U, C)$ is satisfiable. This
implies Theorem 2.

Now let a graph ordering $H$ be given. In order to construct $G'$ it suffices
to determine for each function $f_i$ the relative ordering of the variables that $f_i$
depends on. In the following we show how to determine the relative ordering
of $x_1$ and $x_1'$. During the rearrangement of the hypothetical $H$-FBDD $G$ the
following steps are performed without increasing the size:

1. It is made sure that all tests of $z^1$-variables are arranged as $z^1$-blocks.

2. If afterwards at the source there is no $z^1$-block, the FBDD contains at least
two $z^1$-blocks. Hence, it is possible to rearrange the FBDD by creating a
new $z^1$-block as the source, whose 0-successor is an arbitrary FBDD for $f_1$
and whose 1-successor is the previous FBDD after replacing the $z^1$-variables
by 1.

3. At the 0-successor of the $z^1$-block the function $f_1$ is computed. Since $f_1$ only
depends on $x_1$ and $x_1'$, all nodes labeled by other variables are redundant
and can be removed. From the resulting FBDD for $f_1$ we can determine the
relative ordering of $x_1$ and $x_1'$.

It is crucial to observe that the relative ordering of $x_1$ and $x_1'$ is not
determined by $H$ and can be chosen arbitrarily, if at the source of $H$ a variable that
is not a $z^1$-variable is tested, or if in the third step there are computation paths
with both relative orderings of $x_1$ and $x_1'$. Hence, we can determine the relative
ordering of $x_1$ and $x_1'$ directly from $H$: If at the source of $H$ there is not a
$z^1$-variable, then we may choose the ordering arbitrarily. Otherwise we consider the
graph ordering $H^*$ that is obtained from $H$ by replacing the $z^1$-variable at the
source by 0 and the other $z^1$-variables by 1. This is the same replacement as in
the rearrangement of the $z^1$-nodes to $z^1$-blocks. By a simple depth first search
approach it can be determined whether there is a computation path in $H^*$ on
which $x_1$ is arranged before $x_1'$, whether there is a computation path on which $x_1'$
is arranged before $x_1$, or whether both types of computation paths occur. In the
latter case the relative ordering of $x_1$ and $x_1'$ can be chosen arbitrarily, and in
the former cases the relative ordering is the same as in $H^*$.
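The depth first search mentioned above can be sketched as follows. This is only an illustration under assumptions that are not in the paper: the graph ordering is given as a dictionary mapping a node id to a triple (variable, 0-successor, 1-successor), sinks are ids that do not occur as keys, and the names `relative_orderings`, `nodes` and `fixed` are hypothetical.

```python
def relative_orderings(nodes, source, fixed, x, x_prime):
    """Return the set of variables among {x, x_prime} that are tested first
    on some path from `source`.

    nodes: dict  node id -> (variable, succ0, succ1); sinks are not keys.
    fixed: dict  variable -> 0 or 1, the variables replaced by constants.
    """
    found = set()
    visited = set()
    stack = [source]
    while stack:
        v = stack.pop()
        if v in visited or v not in nodes:
            continue                      # already explored, or a sink
        visited.add(v)
        var, succ0, succ1 = nodes[v]
        if var == x or var == x_prime:
            found.add(var)                # first of the two on this path; stop here
        elif var in fixed:
            stack.append(succ1 if fixed[var] else succ0)
        else:
            stack.extend((succ0, succ1))
    return found
```

If the result contains only one variable, that variable precedes the other on every remaining path; if it contains both, both relative orderings occur and the ordering may be chosen freely.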

Afterwards, we may replace the $z^1$-variables by the constant 1 and may
proceed by determining the relative ordering of $x_2$ and $x_2'$, and so on. In the same
way we may determine the relative orderings of the variables of the functions
corresponding to the clauses. After computing all these orderings we can
construct an FBDD for $F$ as shown in Figure 1, reduce this FBDD and count its
number of nodes. In the same way as in the last section we decide whether $(U, C)$
is satisfiable. We remark that in the case that $(U, C)$ is satisfiable this algorithm
does not necessarily compute a satisfying assignment since it is sufficient that
the algorithm obtains a graph ordering so that the size of the corresponding
FBDD is bounded by $(1 + \delta)(5n + 2m + L + 1)$. Such a graph ordering does not
necessarily correspond to a satisfying assignment but merely to an assignment
for which at most $\varepsilon |C|$ clauses are not satisfied.

Acknowledgment 

I thank Ingo Wegener for fruitful discussions on the proofs in this paper. 

References 

1. Arora, S., Lund, C., Motwani, R., Sudan, M. and Szegedy, M. (1992). Proof veri- 
fication and hardness of approximation problems. In Proc. of 33rd Symposium on 
Foundations of Computer Science, 14-23. 

2. Bern, J., Meinel, C. and Slobodova, A. (1996). Some heuristics for generating tree- 
like FBDD types. IEEE Transactions on Computer-Aided Design of Integrated 
Circuits and Systems 15, 127-130. 

3. Blum, M., Chandra, A.K. and Wegman, M.N. (1980). Equivalence of free Boolean 
graphs can be decided probabilistically in polynomial time. Information Processing 
Letters 10, 80-82. 







4. Bollig, B. and Wegener, I. (1996). Improving the variable ordering of OBDDs is 
NP-complete. IEEE Transactions on Computers 45, 993-1002. 

5. Bryant, R.E. (1986). Graph-based algorithms for Boolean function manipulation. 
IEEE Transactions on Computers 35, 677-691. 

6. Fortune, S., Hopcroft, J. and Schmidt, E.M. (1978). The complexity of equiva- 
lence and containment for free single variable program schemes. In Proc. of 5th 
International Colloquium on Automata, Languages and Programming, LNCS 62, 
227-240. 

7. Garey, M.R. and Johnson, D.S. (1979). Computers and Intractability: A Guide to
the Theory of NP-Completeness. W.H. Freeman.

8. Gergov, J. and Meinel, C. (1994). Efficient Boolean manipulation with OBDD’s 
can be extended to FBDD’s. IEEE Transactions on Computers 43, 1197-1209. 

9. Günther, W. and Drechsler, R. (1999). Minimization of free BDDs. In Proc. of
Asia and South Pacific Design Automation Conference, 323-326.

10. Papadimitriou, C.H. and Yannakakis, M. (1991). Optimization, approximation,
and complexity classes. Journal of Computer and System Sciences 43, 425-440.

11. Sieling, D. (1998). On the existence of polynomial time approximation schemes for 
OBDD-minimization (extended abstract). In Proc. of 15th Symposium on Theo- 
retical Aspects of Computer Science, LNCS 1373, 205-215. 

12. Sieling, D. (1998). The nonapproximability of OBDD minimization. ECCC Report 
TR98-001, Revision 1 (available from www.eccc.uni-trier.de). 

13. Sieling, D. and Wegener, I. (1995). Graph driven BDDs — a new data structure 
for Boolean functions. Theoretical Computer Science 141, 283-310. 

14. Simon, J. and Szegedy, M. (1993). A new lower bound theorem for read-only-once 
branching programs and its applications. In Advances in Computational Complexity 
Theory, Jin-Yi Cai, ed., DIMACS Series in Discrete Mathematics and Theoretical 
Computer Science 13, American Mathematical Society, 183-193. 

15. Tani, S., Hamaguchi, K. and Yajima, S. (1993). The complexity of the optimal
variable ordering problems of shared binary decision diagrams. In Proc. of 4th
International Symposium on Algorithms and Computation, LNCS 762, 389-398.

16. Wegener, I. (1988). On the complexity of branching programs and decision trees for 
clique functions. Journal of the Association for Computing Machinery 35, 461-471. 

17. Zak, S. (1984). An exponential lower bound for one-time-only branching programs. 
In Proc. of Mathematical Foundations of Computer Science, LNCS 176, 562-566. 




Efficient Strongly Universal and Optimally 
Universal Hashing 
(Extended Abstract) 



Philipp Woelfel 

Lehrstuhl Informatik 2, Universität Dortmund, 44221 Dortmund, Germany
woelfel@Ls2.cs.uni-dortmund.de



Abstract. New hash families are analyzed, mainly consisting of the
hash functions

$h_{a,b} : \{0, \ldots, u - 1\} \to \{0, \ldots, r - 1\},\quad x \mapsto ((ax + b) \bmod (kr)) \operatorname{div} k.$

Universal classes of such functions have already been investigated in [5,
6], and used in several applications, e.g. [3,9]. The new constructions
which are introduced here improve in several ways upon the former
results. Some of them achieve a smaller universality parameter, i.e., two
keys collide under a randomly chosen function with a smaller
probability. In fact, an optimally universal hash class is presented, which means
that the universality parameter achieves the minimum possible value.
Furthermore, the bound of the universality parameter of a known,
almost strongly universal hash family is improved, and it is shown how
to reduce the size of a known class, retaining its properties. Finally, a
new composition technique for constructing hash classes for longer keys
is presented. Its application leads to efficient hash families which consist
of linear functions over the ring of polynomials over $\mathbb{Z}_m$.



1 Introduction 

Since its introduction by Carter and Wegman [4,18], the concept of universal
hashing has proven to be very successful. A large number of applications, partly
in areas which are remote from the original problem, are known. They range from
complexity theoretical investigations over message authentication to standard
applications like dictionary implementations or integer sorting.

Let two finite sets $U$ (universe) and $R$ (range) be given, as well as a family
$H$ of hash functions, which map $U$ to $R$. $H$ is called $\varepsilon$-almost universal ($\varepsilon$-AU),
if two keys $x_1 \ne x_2$ from $U$ collide under a randomly chosen hash function with
a probability of at most $\varepsilon$. It is called $\varepsilon$-almost strongly universal ($\varepsilon$-ASU), if
the probability that $x_1, x_2$ are hashed to arbitrary fixed values $y_1, y_2 \in R$ is at
most $\varepsilon$ (for precise definitions see Section 2.1). Applications for $\varepsilon$-AU classes
can be found e.g. in [4, 10, 8, 6, 3, 12], and Wigderson [19] gives a bibliography of
36 papers concerning the topic of pairwise independence, i.e., strongly universal
hashing. In recent years, a new type of hash family was considered. The so called
$\varepsilon$-$\Delta$ universal families are a type of $\varepsilon$-AU families, which were mainly used for
message authentication [11, 13] or the construction of $\varepsilon$-ASU classes [17].

There are several important properties of hash families. They should be easy
to implement, and their cardinality should be as small as possible, since it
determines the number of random bits that are needed to choose a function.
Furthermore, the universality parameter $\varepsilon$ has great influence on the performance of
the underlying application. E.g. in the dynamic dictionary described in [8], hash
functions are chosen at random from an $\varepsilon$-AU hash class, until they satisfy
certain properties. The expected number of hash functions that have to be tested
depends strongly on the universality parameter $\varepsilon$. The expected memory
consumption is also influenced by this parameter, because the sizes of the hash tables
increase if “bad” hash functions are chosen.

The time needed to evaluate functions from a universal hash class is another
important factor. Experiments in [7] have shown that in the mentioned dictionary
implementation, a large amount of computing time is used for the evaluation
of hash functions. We therefore informally say that a hash class is efficient if its
functions can be evaluated efficiently.

While many constructions of AU, ASU and $\Delta$ universal hash classes are
known, they differ very much in how far they satisfy the above properties. Some
families, as those mainly used in message authentication, achieve only a
constant universality parameter (e.g. $\varepsilon$ equal to a fixed negative power of 2). For many
other applications this is not enough. They require $c/r$-AU or $c/r^2$-ASU hash classes,
where $c$ is a constant to be minimized. Most known families achieving such bounds require
prime numbers of size $|U|$, or arithmetic in finite fields, while others need
matrix multiplication or convolution over some finite field. Such solutions are either
inefficient, or introduce the problem of providing prime numbers or irreducible
polynomials. As discussed in [12, 2, 1], this may be inconvenient and inefficient for
applications (e.g. dictionaries or integer sorting), where the size of the universe
is determined at runtime.

To overcome these restrictions, functions of the type
$h_{a,b} : \{0, \ldots, u - 1\} \to \{0, \ldots, r - 1\}$, $x \mapsto ((ax + b) \bmod (kr)) \operatorname{div} k$
were used. The so called “multiplicative class” [6] is $2/r$-AU, and consists of
functions $h_{a,0}$, where $0 < a < u$ is odd and $u = kr$ is a power of 2. In [5], the
concept was generalized. The “linear class” consists of functions $h_{a,b}$ for
$0 \le a, b < kr$ and $k \ge u - 1$. It was shown that this family is $1/r^2$-ASU for
$u$, $k$ and $r$ being powers of the same prime and $5/(4r^2)$-ASU otherwise. Both
classes have already been found useful in applications, such as [3,9], and
experimental studies in [7] have shown that the multiplicative class performs
well in dynamic dictionary implementations.

In this paper, we construct several new hash families with improved
properties, mainly consisting of the functions $h_{a,b}$. Among them are $\varepsilon$-$\Delta$ universal
and $\varepsilon$-ASU classes for small $\varepsilon$, which have significantly smaller cardinality than
the linear class. Also, a new proof yields an improved bound for the universality
parameter of the linear class, in the case that $u$, $k$ and $r$ are not powers of the
same prime (Section 3). New constructions of AU hash classes are presented
in Section 4. Probably the most important results are a $1/r$-AU and an
optimally universal hash class. For the case that $kr$ is a power of 2, no
other constructions of this kind are known which are comparable in efficiency
without sacrificing a reasonable size. Actually, only one direct construction of
an optimally universal hash class that is suitable for implementation has been
mentioned in the literature. It requires, though, arithmetic over finite fields
[16]. Section 5 considers linear functions over polynomials. The resulting hash
classes have properties similar to those given in Sections 3 and 4, and are
very suitable for hashing longer keys without the need for implementing long
integer arithmetic.

2 Preliminaries 

2.1 Definitions 

Let $U$ and $R$ be two finite sets with $u := |U|$, $r := |R|$ and $1 < r < u$. A
function $h : U \to R$ is said to be a hash function with universe $U$ and range $R$.
Various types of families of hash functions have been studied. For the following
definitions of the most important types, we consider $H$ to be a (multi-)set of
functions from $U$ to $R$ and $h$ to be chosen randomly from $H$ (according to the
uniform distribution on $H$).

1. $H$ is $\varepsilon$-almost universal (short: $\varepsilon$-AU), if $\operatorname{Prob}(h(x_1) = h(x_2)) \le \varepsilon$ for all
$x_1 \ne x_2$ in $U$. A $1/r$-AU class is also called universal, and an $\varepsilon$-AU class
with $\varepsilon = (u - r)/(ur - r)$ is called optimally universal.

2. $H$ is $\varepsilon$-almost strongly universal (short: $\varepsilon$-ASU), if for all $y_1, y_2 \in R$ and
$x_1 \ne x_2$ in $U$ the following two conditions are satisfied:

(a) $\operatorname{Prob}(h(x_1) = y_1) = 1/r$.

(b) $\operatorname{Prob}(h(x_1) = y_1 \wedge h(x_2) = y_2) \le \varepsilon$.

If $H$ is $1/r^2$-ASU, it is also called strongly universal (short: SU).

3. Let $R$ be an abelian group. $H$ is $\varepsilon$-$\Delta$ universal (short: $\varepsilon$-$\Delta$U), if
$\operatorname{Prob}(h(x_2) - h(x_1) = d) \le \varepsilon$ for all $d \in R$ and $x_1 \ne x_2$ in $U$. If $H$ is
$1/r$-$\Delta$U, it is also called $\Delta$ universal (short: $\Delta$U).

The term optimally universal was introduced in [14], where it was shown that for
any class $H$ of hash functions, there exist two keys $x_1 \ne x_2$ in $U$ which collide
under at least $|H|(u - r)/(ur - r)$ functions from $H$. It is also well known that
for ASU and $\Delta$U classes, universality parameters of $1/r^2$ and $1/r$ respectively
are the best possible.

2.2 Notation 

In the rest of this paper, we will consider $R$ to be the additive group modulo $r$
and $U$ to be the $u$-element subset $\{0, \ldots, u - 1\}$ of the ring $M = \mathbb{Z}_m$, where
$m = kr \ge u$ for some integer $k$. If not stated otherwise, additions and multiplications
between two elements from $M$ are over this ring. An addition over the group $R$
will be denoted by $\oplus$. Furthermore, for $x \in M$ we define $Mx = \{ax \mid a \in M\}$.
Finally, for two integers $x, y$ we let $\gcd(x, y)$ be the greatest common divisor of
$x$ and $y$ over $\mathbb{Z}$.

For $a, b \in M$ define the mappings

$g_b : M \to R,\ x \mapsto (x + b) \operatorname{div} k$ and $h_{a,b} : U \to R,\ x \mapsto g_b(ax)$.

In other words, $h_{a,b}(x)$ could be written as $((ax + b) \bmod m) \operatorname{div} k$. For
multisets $A, B \subseteq M$, the hash class $\mathcal{H}_{k,A,B}$ consists of the functions $h_{a,b}$ with $a \in A$
and $b \in B$.

Note that the functions $h_{a,b}$ can be evaluated very efficiently on ordinary
computer architectures. Particularly, if $m = kr$ is a power of 2, then the modulo
operation and the division can be replaced by a bitwise “and” and a bitwise
shift. In this case, the time needed for the evaluation of the functions $h_{a,b}$ is
very close to the time needed for one multiplication. Another slight reduction
of the evaluation time can be achieved by choosing $b$ to be 0 for all functions,
since then no addition is required. Generally though, this leads to an increase of
the universality parameter. Since in most applications the size of a hash table
can be increased by a factor of at most 2 without much effort, having $m = 2^n$ is
probably the most important case. In this case the functions $h_{a,b}$ can be
evaluated in $O(n \log n \log\log n)$ steps on Turing machines and in depth $O(\log n)$ and
size $O(n \log n \log\log n)$ by circuits with fan-in 2, using the Schönhage-Strassen
multiplication method [15]. As discussed in [5], circuits for hash classes that
involve prime number arithmetic or finite field arithmetic lack uniformity and are
larger.
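As an illustration of the remark on powers of 2, the following minimal sketch evaluates $h_{a,b}$ with a bitwise “and” and a shift, next to a direct transcription of the definition. The function names and the parameters `k_bits` and `r_bits` (with $k = 2^{k\_bits}$ and $r = 2^{r\_bits}$) are chosen here for illustration and do not appear in the paper.

```python
def make_hash(a, b, k_bits, r_bits):
    """Return h_{a,b} for k = 2**k_bits and r = 2**r_bits, so m = k*r is a power of 2."""
    mask = (1 << (k_bits + r_bits)) - 1        # "value & mask" equals "value mod m"

    def h(x):
        return ((a * x + b) & mask) >> k_bits  # ">> k_bits" equals "div k"

    return h


def h_reference(a, b, k, r, x):
    """Direct transcription of ((a*x + b) mod m) div k for arbitrary k and r."""
    return ((a * x + b) % (k * r)) // k
```

For example, `make_hash(5, 3, 3, 2)` uses $k = 8$, $r = 4$, $m = 32$ and agrees with `h_reference(5, 3, 8, 4, x)` for every key `x`.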

3 $\Delta$U and SU Hashing

We first state a lemma which is crucial for the proofs in the next two sections. 

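Lemma 1. Let $d \in M$ and $x_1, x_2 \in U$ with $\delta = x_2 - x_1 \ne 0$ and $\gamma = |\gcd(\delta, m)|$.
Then

(a) $M\delta = \{i\gamma \mid 0 \le i < m/\gamma\}$

(b) $\operatorname{Prob}_{a \in M}(ax_2 - ax_1 = d) = \operatorname{Prob}_{a \in M}(a\delta = d) =
\begin{cases} \gamma/m, & \text{if } d \in M\delta, \\ 0, & \text{otherwise.} \end{cases}$

Proof. Observing that $M\delta$ is the principal ideal generated by $\delta$, (a) follows from
basic algebra. For the proof of (b), consider the mapping $\varphi_\delta : M \to M\delta$, $a \mapsto a\delta$.
Since $\varphi_\delta$ is a surjective group homomorphism, the number of $a \in M$ satisfying
$a\delta = d$ is 0 if $d \notin M\delta$, and otherwise is $|\varphi_\delta^{-1}(d)| = |M|/|M\delta| = \gamma$. □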


3.1 Homogeneous Functions 

We now consider the $\varepsilon$-$\Delta$U hash families $\mathcal{H}_{k,M,\{0\}}$ for $k \ge u - 1$. They consist of
the functions $h_{a,0}$ and can, as noted in the introduction, be evaluated without
addition, i.e., $h_{a,0}(x) = ((ax) \bmod m) \operatorname{div} k$. As we show below, the universality
parameter of $\mathcal{H}_{k,M,\{0\}}$ is $2/r$ if $kr$ is a prime power and $3/r$ otherwise.

For the latter case, it can be bounded more precisely as follows. Let
$\Gamma_{u,r,k} = \max\{0,\ \gamma = \gcd(x, kr) \mid 0 < x < u,\ \gamma \nmid k\}$. For $k \ge u - 1$, which is the case to be
considered here, we have $0 \le \Gamma_{u,r,k}/k < 1$. Moreover, $\Gamma_{u,r,k}/k$ converges to 0
with increasing $k$, and if $kr$ is a prime power, this term equals 0 in any case.

Theorem 1.

1. If $k \ge u - 1$, then $\mathcal{H}_{k,M,\{0\}}$ is $c/r$-$\Delta$U, where $c = 2 + \Gamma_{u,r,k}/k < 3$.

2. If $m = kr$ is a power of the prime $p$ and $k \ge u/p$, then $\mathcal{H}_{k,M,\{0\}}$ is $2/r$-$\Delta$U.

Proof. Let $d \in R$ and $x_1, x_2 \in U$ with $\delta = x_2 - x_1 \ne 0$ and $\gamma = |\gcd(\delta, m)|$
(obviously $\gamma \le k$). For arbitrary $y_1, y_2 \in M$, it is clear that $g_0(y_2) - g_0(y_1) = d$
requires $k(d - 1) < y_2 - y_1 < k(d + 1)$. Therefore, one obtains by Lemma 1 that

\[
\operatorname{Prob}\bigl(g_0(ax_2) - g_0(ax_1) = d\bigr)
\;\le \sum_{\substack{y \in M\delta \\ k(d-1) < y < k(d+1)}} \gamma/m
\;\le\; \lceil 2k/\gamma \rceil \cdot \frac{\gamma}{m}.
\]

This is at most $(2 + \Gamma_{u,r,k}/k)/r$. □
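For small parameters the bound of Theorem 1 can be compared with the exact universality parameter by exhaustive enumeration. The following sketch is an illustration only; the helper names `delta_parameter` and `gamma_term` are hypothetical, and the code simply transcribes the definitions of $h_{a,b}$, of $\varepsilon$-$\Delta$ universality and of $\Gamma_{u,r,k}$.

```python
from fractions import Fraction
from math import gcd


def h(a, b, x, k, r):
    """h_{a,b}(x) = ((a*x + b) mod (k*r)) div k."""
    return ((a * x + b) % (k * r)) // k


def delta_parameter(u, k, r, A, B):
    """Largest probability, over x1 != x2 in U and d in R, that
    h_{a,b}(x2) - h_{a,b}(x1) = d (mod r) for a uniform in A and b uniform in B."""
    worst = 0
    for x1 in range(u):
        for x2 in range(u):
            if x1 == x2:
                continue
            counts = [0] * r
            for a in A:
                for b in B:
                    counts[(h(a, b, x2, k, r) - h(a, b, x1, k, r)) % r] += 1
            worst = max(worst, max(counts))
    return Fraction(worst, len(A) * len(B))


def gamma_term(u, k, r):
    """Gamma_{u,r,k}: largest gcd(x, k*r) with 0 < x < u not dividing k (0 if none)."""
    values = [gcd(x, k * r) for x in range(1, u)]
    return max((g for g in values if k % g != 0), default=0)
```

With $u = 4$, $k = 4$, $r = 3$ one gets `gamma_term(4, 4, 3) == 3`, so the bound of part 1 is $(2 + 3/4)/3$; `delta_parameter(4, 4, 3, range(12), [0])` gives the exact value for comparison.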

We now use this result to construct an $\varepsilon/r$-ASU hash class with $\varepsilon$ as given
in the above theorem. The following construction method is described in [17].
Consider a hash family $H$ of functions $U \to R$, where $(R, \oplus)$ is an abelian group.
For any $h \in H$ and any $y \in R$ let $\varphi_{h,y} : U \to R$ be defined by $\varphi_{h,y}(x) = h(x) \oplus y$.
It is easy to see that if $H$ is $\varepsilon$-$\Delta$U, then the family of functions $\varphi_{h,y}$ for
$(h, y) \in H \times R$ is $\varepsilon/r$-ASU.

Corollary 1. Let $B = \{ik \mid 0 \le i < r\}$.

1. If $k \ge u - 1$, then $\mathcal{H}_{k,M,B}$ is $c/r^2$-ASU, where $c = 2 + \Gamma_{u,r,k}/k < 3$.

2. If $m = kr$ is a power of the prime $p$ and $k \ge u/p$, then $\mathcal{H}_{k,M,B}$ is $2/r^2$-ASU.

Proof. $h_{a,ki}(x) = h_{a,0}(x) \oplus i$. □
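The identity used in the proof of Corollary 1 says that adding $ki$ before the division by $k$ shifts the hash value by $i$ in the group $R$. A small exhaustive check, with function names chosen here for illustration, could look as follows.

```python
def h(a, b, x, k, r):
    """h_{a,b}(x) = ((a*x + b) mod (k*r)) div k."""
    return ((a * x + b) % (k * r)) // k


def shifted_identity_holds(k, r, u):
    """Check h_{a,k*i}(x) == (h_{a,0}(x) + i) mod r for all a in Z_m, 0 <= i < r, x in U."""
    m = k * r
    return all(
        h(a, k * i, x, k, r) == (h(a, 0, x, k, r) + i) % r
        for a in range(m)
        for i in range(r)
        for x in range(u)
    )
```

This is exactly the form $\varphi_{h,y}(x) = h(x) \oplus y$ with $h = h_{a,0}$ and $y = i$, which is why the construction from [17] applies.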



3.2 Inhomogeneous Functions 

Recall that the $\varepsilon$-$\Delta$U hash class from Theorem 1 consists of functions $h_{a,0}$, which
can be evaluated without addition. It can be expected though, that on most
computer architectures the time for an extra addition is very small compared to the
time needed for the multiplication and the divisions. We will therefore increase
the class by adding functions $h_{a,b}$ with $b \ne 0$, to obtain better universality
parameters. The resulting class is $9/(8r)$-$\Delta$U, and in the prime power case even
$1/r$-$\Delta$U.

The universality parameter $\varepsilon$ can be described more precisely by the following
term. We obtain $\varepsilon = 1/r + \vartheta_k(\Gamma_{u,r,k})/r$, where for $0 \le \gamma$

\[
\vartheta_k(\gamma) =
\begin{cases}
0, & \text{if } \gamma = 0, \\[2pt]
\dfrac{1}{4 \lfloor k/\gamma \rfloor \cdot (\lfloor k/\gamma \rfloor + 1)}, & \text{otherwise.}
\end{cases}
\]

Although this may look technical, the important properties can be described
easily. For $k \ge u - 1$, we obtain $\vartheta_k(\Gamma_{u,r,k}) \le 1/8$. Further, since $\Gamma_{u,r,k}$ is at most
$u - 1$, we find that $\vartheta_k(\Gamma_{u,r,k})$ is in $O(1/k^2)$, and therefore by an increase of $k$
converges quite fast to 0.
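To get a feeling for the size of this correction term, it can be evaluated exactly. The sketch below assumes the reading $\vartheta_k(\gamma) = 1/(4 \lfloor k/\gamma \rfloor (\lfloor k/\gamma \rfloor + 1))$ for $\gamma > 0$ and $\vartheta_k(0) = 0$; the function name `theta` is chosen here for illustration.

```python
from fractions import Fraction


def theta(k, gamma):
    """theta_k(gamma) for 0 <= gamma <= k, as an exact fraction."""
    if gamma == 0:
        return Fraction(0)
    q = k // gamma                 # floor(k / gamma); at least 1 because gamma <= k
    return Fraction(1, 4 * q * (q + 1))
```

The largest value, $1/8$, is attained when $\lfloor k/\gamma \rfloor = 1$; for fixed $\gamma$ the value decreases roughly like $\gamma^2/(4k^2)$ as $k$ grows.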

Lemma 2. For $1 \le i \le n$ let $(y_i, y_i')$ be pairs of elements from $M$ where the
$\delta_i = y_i - y_i'$ are pairwise distinct and form a set $\{i\gamma + c \mid 0 \le i < m/\gamma\}$ for
some $c \in M$ and some $\gamma \le k$. Then for any $d \in R$ and randomly chosen
$1 \le i \le n$, $0 \le b < k$,

\[
\operatorname{Prob}\bigl(g_b(y_i) - g_b(y_i') = d\bigr) \le
\begin{cases}
1/r, & \text{if } \gamma \text{ divides } k, \\
(1 + \vartheta_k(\gamma))/r, & \text{otherwise.}
\end{cases}
\]

Proof. Let $1 \le j \le n$. It can easily be shown that the number of
$b \in \{0, \ldots, k - 1\}$ satisfying $g_b(y_j) - g_b(y_j') = d$ is exactly $\delta_j \bmod k$ if
$\delta_j \in \{k(d - 1), \ldots, kd - 1\}$ and $k - \delta_j \bmod k$ if $\delta_j \in \{kd, \ldots, k(d + 1) - 1\}$.
Otherwise, no $b \in B$ will satisfy this condition.
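The counting claim at the start of this proof can be verified exhaustively for small $k$ and $r$. In the sketch below, which is only an illustration with made-up function names, the two intervals are interpreted modulo $m = kr$, and $\delta$ stands for $(y - y') \bmod m$.

```python
def g(b, x, k, r):
    """g_b(x) = ((x + b) mod (k*r)) div k."""
    return ((x + b) % (k * r)) // k


def count_b(y, y_prime, d, k, r):
    """Number of b in {0, ..., k-1} with g_b(y) - g_b(y') = d (mod r)."""
    return sum(1 for b in range(k)
               if (g(b, y, k, r) - g(b, y_prime, k, r) - d) % r == 0)


def predicted_count(delta, d, k, r):
    """The count claimed in the proof, intervals taken modulo m = k*r."""
    m = k * r
    offset = (delta - k * d) % m       # position of delta relative to k*d
    if offset < k:                     # delta in {kd, ..., k(d+1)-1}
        return k - delta % k
    if offset >= m - k:                # delta in {k(d-1), ..., kd-1}
        return delta % k
    return 0
```

Comparing `count_b(y, y_prime, d, k, r)` with `predicted_count((y - y_prime) % (k * r), d, k, r)` over all arguments confirms the claim for the chosen $k$ and $r$.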

Now assume w.l.o.g. that those $\delta_i$ in $\{k(d - 1), \ldots, kd - 1\}$ are $\delta_1 < \ldots < \delta_t$
and those in $\{kd, \ldots, k(d + 1) - 1\}$ are $\delta_{t+1} < \ldots < \delta_{t+t'}$. If we let

\[
S = \sum_{j=1}^{t} \delta_j \bmod k \;+ \sum_{j=t+1}^{t+t'} (k - \delta_j \bmod k),
\]

then the probability of $g_b(y_i) - g_b(y_i') = d$ is $S/k \cdot \gamma/m = 1/r \cdot S\gamma/k^2$. In
order to bound the sum $S$, we have to consider three cases, namely $t = t'$,
$t = t' - 1 = \lfloor k/\gamma \rfloor$ and $t = t' + 1 = \lceil k/\gamma \rceil$.

First, assume $t = t'$, and note that $\gamma \mid k$ implies this case. Using the fact that
$\delta_j \bmod k - \delta_{t+j} \bmod k = k - t\gamma$ ($1 \le j \le t$), we obtain
$S\gamma/k^2 = 2t(\gamma/k) - t^2(\gamma^2/k^2)$. And since the real function $F(x) = \lambda x - \tau x^2$
has a global maximum with value $\lambda^2/(4\tau)$, we have $S\gamma/k^2 \le (2t)^2/(4t^2) = 1$.

Now assume $t = t' - 1 = \lfloor k/\gamma \rfloor$. We obtain

\[
S = \sum_{j=1}^{t} (\delta_j \bmod k + k - \delta_{t+j+1} \bmod k) + k - \delta_{t+1} \bmod k
\;\le\; k + t(2k - \gamma(t + 1)),
\]

thus $S\gamma/k^2 \le (2t + 1)(\gamma/k) - t(t + 1)(\gamma/k)^2$. For the same reason as above, this
is at most $1 + 1/(4 \lfloor k/\gamma \rfloor (\lfloor k/\gamma \rfloor + 1))$.

The last case, $t = t' + 1 = \lceil k/\gamma \rceil$, follows by symmetry. □

The functions $h_{a,b}$ with $a \in M$ and $b \in \{0, \ldots, k - 1\}$ now give the desired
$\varepsilon$-$\Delta$U class. The proof is omitted here, since it mainly is the combination of
Lemma 1 and Lemma 2.



Theorem 2. Let $m = kr$ and $B = \{0, \ldots, k - 1\}$.

1. If $k \ge u - 1$, then $\mathcal{H}_{k,M,B}$ is $c/r$-$\Delta$U, where $c = 1 + \vartheta_k(\Gamma_{u,r,k}) \le 9/8$.

2. If $m$ is a power of the prime $p$ and $k \ge u/p$, then $\mathcal{H}_{k,M,B}$ is $\Delta$U.
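A brute-force comparison for small parameters illustrates how much the additional offsets $b \in \{0, \ldots, k-1\}$ help compared with the homogeneous class. The sketch below is an illustration under the reading $c = 1 + \vartheta_k(\Gamma_{u,r,k})$ with $\vartheta_k(\gamma) = 1/(4\lfloor k/\gamma \rfloor(\lfloor k/\gamma \rfloor + 1))$; all function names are chosen here and are not from the paper.

```python
from fractions import Fraction
from math import gcd


def h(a, b, x, k, r):
    return ((a * x + b) % (k * r)) // k


def delta_parameter(u, k, r, A, B):
    """Exact Delta-universality parameter of {h_{a,b} : a in A, b in B} on U = {0..u-1}."""
    worst = 0
    for x1 in range(u):
        for x2 in range(u):
            if x1 != x2:
                counts = [0] * r
                for a in A:
                    for b in B:
                        counts[(h(a, b, x2, k, r) - h(a, b, x1, k, r)) % r] += 1
                worst = max(worst, max(counts))
    return Fraction(worst, len(A) * len(B))


def theorem2_bound(u, k, r):
    """(1 + theta_k(Gamma_{u,r,k})) / r, assuming k >= u - 1."""
    gammas = [gcd(x, k * r) for x in range(1, u)]
    big_gamma = max((g for g in gammas if k % g != 0), default=0)
    if big_gamma == 0:
        return Fraction(1, r)
    q = k // big_gamma
    return (1 + Fraction(1, 4 * q * (q + 1))) / r
```

For $u = 4$, $k = 4$, $r = 3$ the bound evaluates to $(1 + 1/8)/3 = 3/8$, and `delta_parameter(4, 4, 3, range(12), range(4))` can be compared against it as well as against the value obtained with `B = [0]`.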

Using the techniques from Section 3.1, it is now an easy task to construct a
$9/(8r^2)$-ASU hash class. The resulting construction is the “linear class”, which
was first investigated by Dietzfelbinger [5]. He showed that it is $5/(4r^2)$-ASU and
even SU for $m$ being a prime power. Our bound therefore is somewhat tighter
in the case of $m$ being no prime power.

Corollary 2. If $k \ge u - 1$, then $\mathcal{H}_{k,M,M}$ is $c/r^2$-ASU, where $c = 1 + \vartheta_k(\Gamma_{u,r,k}) \le 9/8$.

For the prime power case, we present now a new SU family, which has a 
smaller cardinality than the “linear class”. While for the above construction, 
about log(u) + log(r) random bits are necessary to choose the parameter b, 
approximately log(u)/2 random bits can be spared by using the following hash 
family. 

Theorem 3. Let r be a power of the prime p and k = p^ >u — l. 

1. If B = I 0 < i then Hk,M,B is AU. 

2. IfB= {ipr'^/21 I 0 < i < %kM,B is SU. 

The somewhat lengthy proof will be given in the full version of this paper. 



4 Universal and Optimally Universal Hashing 

4.1 Universal Hashing 

Clearly, any ε-AΔU hash family is also ε-AU. In many situations though, one is
only interested in finding hash classes where two arbitrary keys collide with low
probability. The much stronger property of ε-Δ-universality is often unnecessary.
We are therefore interested in finding AU hash classes that sacrifice the ASU
or AΔU property for being more efficient and/or smaller in cardinality.

For the following constructions we will choose k = u/r. The classes are
smaller than those from the former section, and can in many situations be
evaluated more efficiently. Assume that a computer word consists of N bits, and that
we have $u = 2^N$ and $r = 2^{N'}$ for some N' < N. On most processors, a
multiplication modulo $2^N$ can be evaluated much faster than modulo $2^{N+N'}$, so that
having k = u/r (compared to k ≈ u) leads to an improvement in efficiency. We
start again with a homogeneous version, allowing the functions to be computed
without addition.

Theorem 4. Let $m = kr \ge u$ be a power of the prime $p$, and $A =
\{ip + 1 \mid 0 \le i < m/p\}$. Then $\mathcal{H}_{k,A,\{0\}}$ is $2/r$-AU.

Proof. Let $x_1, x_2 \in U$ with $\delta = x_2 - x_1 \ne 0$. If $\delta$ is a multiple of $k$, then
$k \le a\delta \le m - k$ for any $a \in A$, thus $x_1$ and $x_2$ do not collide at all. Assume
therefore that $\delta$ is not a multiple of $k$, thus $k$ is a multiple of $\gamma = |\gcd(p\delta, m)| =
p \cdot |\gcd(\delta, m/p)|$. Then obviously $A\delta = \{i\gamma + \delta \mid 0 \le i < m/\gamma\}$. So, similar to the
proof of Theorem 1, one obtains

$$\mathrm{Prob}\bigl(g_0(ax_2) = g_0(ax_1)\bigr) \;\le \sum_{\substack{y \in A\delta \\ -k < y < k}} \gamma/m \;\le\; \lceil 2k/\gamma \rceil \cdot \gamma/m,$$

which is at most $2/r$. □

For m a power of 2, almost the same hash family was established in [6], and called
the multiplicative class. Our construction generalizes the concept for arbitrary
prime powers m. Note also that if kr is a power of 2, it is possible to use
only half the functions. It can be proven that in this case a can be chosen
randomly from the set of odd numbers {1, 3, ..., m/4 − 1} without increasing
the universality parameter.
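The bound of Theorem 4 can be checked exhaustively for small parameters. The sketch below assumes that the homogeneous functions have the form h_a(x) = ((a·x) mod m) div k, which is how the multiplicative class is usually written; the helper names `h` and `max_collision_probability` are illustrative and do not come from the paper.

```python
from itertools import combinations


def h(a: int, x: int, m: int, k: int) -> int:
    """Assumed form of the hash function: ((a * x) mod m) div k."""
    return (a * x % m) // k


def max_collision_probability(u: int, m: int, r: int, p: int) -> float:
    """Worst collision probability over all pairs of distinct keys in {0..u-1}
    when the multiplier a is drawn uniformly from A = {i*p + 1 | 0 <= i < m/p}."""
    k = m // r
    multipliers = [i * p + 1 for i in range(m // p)]
    worst = 0
    for x1, x2 in combinations(range(u), 2):
        collisions = sum(h(a, x1, m, k) == h(a, x2, m, k) for a in multipliers)
        worst = max(worst, collisions)
    return worst / len(multipliers)


# m = kr = 64 is a power of p = 2, with r = 8 hash values and u = m keys.
print(max_collision_probability(u=64, m=64, r=8, p=2))
```

For these parameters the printed value should not exceed 2/r = 0.25 if the theorem is read correctly; the check is only an illustration for one small instance, not a proof.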

The construction of the universal class which we present now is completely
new. Actually, no other universal class is known which is of reasonable
cardinality and as efficient as ours for kr being a power of 2.

Theorem 5. Let $r$ be a power of the prime $p$, and $k = p^{[\ldots]} > u/r$. Furthermore,
let $A = \{ip + 1 \mid 0 \le i < m/p\}$ and $B = \{[\ldots] \mid 0 \le i < [\ldots]\}$.
Then $\mathcal{H}^{univ} := \mathcal{H}_{k,A,B}$ is universal.

More precisely, any $x_1 \ne x_2$ in $U$ collide under a randomly chosen $h \in \mathcal{H}^{univ}$
with the same probability $1/r$ if $|\gcd(x_2 - x_1, m)| < k$. Otherwise, they do not
collide at all. We omit the proof, which is very similar to the proofs of Theorems 3
and 4.



4.2 Optimally Universal Hashing 

As we said before, some specific keys do not collide under any function from
$\mathcal{H}^{univ}$. The idea of the following construction is to add other functions, which
let exactly such keys collide. This leads to an improvement in the universality
parameter. In fact, for u being a power of r (which itself is a prime power), the
universality parameter takes its minimum possible value, i.e., the class is
optimally universal. Note that there is an equivalence between optimally universal
hash classes and resolvable balanced incomplete block designs (RBIBDs), see
[16]. Although many existence results for RBIBDs are known, not much attention
has been paid in the literature to how to evaluate them efficiently. Likewise, no
very practical constructions of optimally universal hash classes have been found
yet. The hash family we present now fills this gap, since it is of reasonable
size and its functions are easy to implement and efficient to evaluate. The proof
is omitted due to space limits.

Theorem 6. Let $r$ be a power of the prime $p$, $m = r^t \ge u$ for some integer $t$ and
$k = p^{[\ldots]}$. Further, let $A = \{(ip + 1) \cdot r^j \mid 0 \le j < t,\ 0 \le i < m/(pr^j)\}$
and $B = \{[\ldots] \mid 0 \le i < [\ldots]\}$. Then for $\mathcal{H}^{opt} := \mathcal{H}_{k,A,B}$ the following
holds:

1. Any two distinct keys collide with the same probability $(m - r)/(mr - r) < 1/r$
under a randomly chosen function.

2. If $m = u$, then $\mathcal{H}^{opt}$ is optimally universal.

Assuming that $m = u = r^t$, for $\mathcal{H}^{opt}$ we have $|A| = (u/p) \cdot (r^t - 1)/(r^t - r^{t-1})$,
which is less than $(u/p) \cdot r/(r - 1)$. So, this class is not much larger than $\mathcal{H}^{univ}$,
where $|A| = u/p$. Since the evaluation of the functions from both classes is
equally efficient (provided that the word length for the multiplication is equal),
$\mathcal{H}^{opt}$ may be preferred in many situations. In others though, namely those where
the smallest $r^t \ge u$ is much larger than $u$, $\mathcal{H}^{univ}$ might be the better choice.
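The cardinality claim can be illustrated by enumerating the multiplier set directly. The sketch assumes that the set in Theorem 6 reads A = {(ip + 1)·r^j | 0 ≤ j < t, 0 ≤ i < m/(p·r^j)} with m = r^t, which is the reading that agrees with the stated size of A; `opt_multipliers` is a hypothetical helper name.

```python
def opt_multipliers(p: int, r: int, t: int) -> set:
    """Enumerate the assumed multiplier set A of the optimally universal class
    for m = r**t, where r is a power of the prime p."""
    m = r ** t
    return {
        (i * p + 1) * r ** j
        for j in range(t)
        for i in range(m // (p * r ** j))
    }


p, r, t = 2, 4, 3                      # m = u = 64
u = r ** t
size = len(opt_multipliers(p, r, t))
closed_form = (u // p) * (r ** t - 1) // (r ** t - r ** (t - 1))
print(size, closed_form, (u / p) * r / (r - 1))
```

For p = 2, r = 4, t = 3 the enumeration yields 32 + 8 + 2 = 42 multipliers, compared with u/p = 32 for the universal class and the upper estimate (u/p)·r/(r − 1) ≈ 42.67.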

5 Hashing Polynomials 

In some situations, it is necessary to hash keys which do not fit into a single
computer word. In such cases, the evaluation of the functions $h_{a,b}$ may be
inconvenient and inefficient, since multiplication of long integers would have to be
implemented. If an ε-AΔU hash class $H$ with universe $U$ and range $R$ is given,
then the following construction allows us to hash $v$ words from $U$ to $\rho$ words
from $R$ with a universality parameter of $\varepsilon^\rho$.

Proposition 1. Let $H$ be an ε-AΔU family of hash functions $U \to R$. Then for
any two integers $1 \le \rho \le v$, there exists an $\varepsilon^\rho$-AΔU family of hash
functions $U^v \to R^\rho$.

Proof. For any $h = (h_1, \dots, h_{v+\rho-1}) \in H^{v+\rho-1}$, let $\psi_h : U^v \to R^\rho$ map
$(x_1, \dots, x_v)$ to $(y_1, \dots, y_\rho)$, where $y_i = \sum_{j=1}^{v} h_{i+j-1}(x_j)$. We show that the
class consisting of all functions $\psi_h$, where $h \in H^{v+\rho-1}$, is $\varepsilon^\rho$-AΔU. Consider two
distinct keys $x = (x_1, \dots, x_v)$, $x' = (x'_1, \dots, x'_v) \in U^v$, and let $t$ be the highest
index with $x_t \ne x'_t$. Furthermore, let $(d_1, \dots, d_\rho) = \psi_h(x') - \psi_h(x)$ for some
randomly chosen $h = (h_1, \dots, h_{v+\rho-1})$, and let $1 \le i \le \rho$. By the assumption,
$h_{i+t-1}(x_t) - h_{i+t-1}(x'_t)$ takes any value in $R$ with a probability of at most $\varepsilon$,
and therefore so does $d_i$. Observing that $d_1, \dots, d_{i-1}$ are independent of the
choice of $h_{i+t-1}$, the result follows immediately by induction over $i$. □
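The construction in the proof can be sketched in a few lines. The code assumes that the range is R = Z_r, so that the sum defining each output word is taken modulo r; the name `combine` and the toy functions used in the example are illustrative only.

```python
from typing import Callable, Sequence, Tuple


def combine(hs: Sequence[Callable[[int], int]],
            x: Sequence[int], rho: int, r: int) -> Tuple[int, ...]:
    """Hash v words to rho words with v + rho - 1 base functions.

    With 0-based indices, output word i is sum_j hs[i + j](x[j]) mod r,
    which corresponds to y_i = sum_{j=1..v} h_{i+j-1}(x_j) in the proof.
    """
    v = len(x)
    if len(hs) != v + rho - 1:
        raise ValueError("need exactly v + rho - 1 base hash functions")
    return tuple(sum(hs[i + j](x[j]) for j in range(v)) % r for i in range(rho))


# Toy base functions t -> (c * t) mod 5; in practice each would be drawn
# independently at random from an almost-Delta-universal class.
r = 5
hs = [lambda t, c=c: (c * t) % r for c in (1, 2, 3, 4)]
print(combine(hs, (1, 2, 3), rho=2, r=r))
```

Each output word reuses all but one of the functions of its neighbour, which is why only v + ρ − 1 functions are needed instead of v·ρ.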

The direct application of the construction method to the hash classes from the
former sections would already yield AΔU and, with the method described in
Section 3, ASU families. A slight modification of this technique, though, will in
our case lead to somewhat more efficient constructions.

Let the universe $\mathcal{U} = U^v$ with $U = \{0, \dots, u-1\}$ and the range $\mathcal{R} = R^\rho$
with $R = \{0, \dots, r-1\}$. As before, let $m = kr \ge u$ for some integer $k$ and
consider all operations over the ring $M = \mathbb{Z}_m$. Further, let $n \ge v$ be some
fixed integer (to be determined later). For $a = (a_0, \dots, a_{n-1}) \in M^n$ and $x =
(x_0, \dots, x_{n-1}) \in M^n$, we define the convolution of $a$ and $x$, $a * x$, to be the
$n$-element vector $y = (y_0, \dots, y_{n-1}) \in M^n$, where $y_i = [\ldots]$. For
our hash functions, only the last $\rho$ coordinates of the convolution need to be
computed. For $a \in M^n$ and $b = (b_0, \dots, b_{\rho-1}) \in M^\rho$, let

$g_b : M^n \to M^\rho,\ (x_0, \dots, x_{n-1}) \mapsto (y_0, \dots, y_{\rho-1})$,

where $y_i = (x_{i+n-\rho} + b_i) \text{ div } k$, and

$h_{a,b} : \mathcal{U} \to \mathcal{R},\ x \mapsto g_b(a * x)$.

Note that the functions $h_{a,b}$ are linear functions over the ring of polynomials
over $M$ of degree less than $n$.

The hash classes $\mathcal{H}^{n}_{k,A,B}$ for specific $A \subseteq M^n$ and $B \subseteq M^\rho$ consist of the
hash functions $h_{a,b}$ for $(a, b) \in A \times B$. Varying the parameters $k$, $A$ and $B$, we
get, similarly to the former sections, several ASU, AΔU and AU hash families. We
summarize all results in the following theorem.

Theorem 7. Let $k > u/p$ if $r$ is a power of the prime $p$, and $k > u - 1$ otherwise.
Let further $c_1 = (2 + [\ldots])^\rho$ and $c_2 = (1 + \vartheta_k(ru, r, k))^\rho$ (recall that $c_1 < 3^\rho$
and $c_2 < (9/8)^\rho$, and if $m = kr$ is a prime power, then $c_1 = 2^\rho$ and $c_2 = 1$).
Then the following holds:

1. Let $n = v + \rho - 1$ and $A = M^n$.

(a) If $B = \{0\}^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_1/r^\rho$-AU.

(b) If $B = \{0, \dots, k-1\}^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_2/r^\rho$-AΔU.

(c) If $B = \{ik \mid 0 \le i < r\}^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_1/r^{[\ldots]}$-ASU.

(d) If $B = M^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_2/r^{[\ldots]}$-ASU.

2. Let $u = r$, $n = v$ and $A = \{(k, a_1, \dots, a_{n-1}) \mid a_1, \dots, a_{n-1} \in M\}$.

(a) If $B = \{0\}^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_1/r^\rho$-AU.

(b) If $B = \{0, \dots, k-1\}^\rho$, then $\mathcal{H}^{n}_{k,A,B}$ is $c_2/r^\rho$-AΔU.

Sketch of Proof. 1 (a) and (b) are proved by an induction like that in the proof
of Proposition 1, but using the techniques from Theorem 1 and Lemma 2,
respectively. For (c) and (d), recall the construction method described before
Corollary 1. Finally, 2 (a) and (b) are obvious by considering the two cases that either
two given vectors (keys) from the universe differ only in the $\rho$ highest coordinates
or also in lower ones.

Note that the AU classes require the words of the range and the universe to
have the same length, i.e., U = R.

Abstract. We study the problem of page replication in ring networks.
The goal is to determine a set of nodes which should contain a page of
read-only data in their local memories so that the total cost of accessing
data is lowest possible. We prove a lower bound on the competitive ratio
of any deterministic on-line algorithm in large uniform rings which
approaches 2.31023 as the page size and the number of nodes go to infinity.
We present a $(3 + \sqrt{3})/2 \approx 2.36603$-competitive deterministic on-line
algorithm for the 4-node uniform ring. We also show a matching lower
bound for any deterministic on-line algorithm in this topology. Our
results disprove the conjecture of Black and Sleator (1989) for the lower
bound of 2.5.

1 Introduction 

A common approach for providing global shared memory in a network of pro- 
cessors is to distribute physical pages among the local memories of particular 
nodes. A node which often needs to read data stored at another node may re- 
duce its overall access time by storing a copy of the relevant page in its local 
memory. In the replication problem one has to decide which subset of the nodes 
should hold a copy of the page in order to minimize the total access cost. We are 
interested in on-line algorithms which serve every request as soon as it occurs. 
The decision is based only on the current and past requests and no assump- 
tion about the distribution and location of future requests is made. We study 
on-line algorithms using competitive analysis [7, 11]. In competitive analysis the 
cost of an on-line algorithm A on a request sequence $\sigma$, denoted as $C_A(\sigma)$, is
compared to $C_{OPT}(\sigma)$, the cost of the optimal off-line algorithm OPT on $\sigma$. A
deterministic on-line algorithm A is c-competitive if, for all request sequences
$\sigma$, $C_A(\sigma) \le c \cdot C_{OPT}(\sigma)$.

The replication problem has been intensively studied in the on-line setting
[1-3, 5, 8, 9]. In its simplest two-node version it corresponds to the ski-rental
problem [10]. It was shown by Bartal et al. [2] that on general graphs no
deterministic or randomized algorithm can achieve a competitive ratio smaller than
$\Omega(\log n)$, where n is the number of nodes in the network. On the other hand, for
certain network topologies such as trees and complete graphs strongly
competitive on-line algorithms have been obtained [1, 3, 9]. There still remain a number
of open problems with respect to the ring, a network which is important in both
theory and practice of distributed computing [6, 12]. Deterministic and
randomized on-line algorithms were proposed for arbitrary and uniform rings [1, 5, 8].
Currently, the best deterministic upper bound is 3 in uniform [5] and 4 in
arbitrary rings [1]. As far as the lower bound is concerned, Black and Sleator [3]
conjectured that no deterministic on-line algorithm for rings can be better than
2.5-competitive, even on a 4-node ring. Apart from this claim no lower bounds
for the ring topology have been demonstrated so far.

In this paper we continue our earlier study on the replication problem in 
uniform rings [5]. We propose two new lower bounds and one new upper bound 
for this problem. In particular, we prove a lower bound on the competitive ratio 
of any deterministic algorithm in large rings which tends to 2.31023 as the page 
size and the number of nodes go to infinity. At the other end of the scale we study 
the replication problem in the 4-node ring which is the smallest non-trivial case of 
this problem. We present a $(3 + \sqrt{3})/2 \approx 2.36603$-competitive deterministic algorithm
and a matching lower bound against any deterministic on-line algorithm in this
topology. Our results disprove the conjecture of Black and Sleator [3]. We note
that the lower bound for large rings and the upper bound for 4-node rings have
been obtained independently in [4].

2 Problem Statement 

Let $C_n$ be the ring of circumference n with n equally spaced nodes. Note that all
edges of $C_n$ have length 1. A ring of this type is called uniform. Let $v_1, v_2, \dots, v_n$
be the ordering of nodes if we scan the ring $C_n$ in the clockwise direction starting
from a selected node $v_1$. We denote by $(u, w)$ the arc that is obtained if we start
at node u and go to node w in the clockwise direction. Let $l(u, w)$ be the length
of arc $(u, w)$.

We say that a node v has the page if the page is contained in its local memory.
We assume that initially only node $v_1$ has the page. A request at a node v occurs
if v needs to access the page. If v has the page then the request can be satisfied
with zero cost. Otherwise the request is served by accessing a node w holding
the page and incurring cost equal to the distance from v to w. After the request
is served the page may be replicated from node w holding the page to any node
v' which does not have it at a cost of d times the distance between w and v' (v
and v' may coincide). Symbol d stands for the page size factor. The page may
be replicated only after a request is served. Following [3] we assume that if an
algorithm replicates the page from node w to v, then the page is also replicated
with no extra cost to all nodes on the path from w to v. The right (resp. left)
boundary node of algorithm A at some time is the endmost node to which the
page has been replicated clockwise (resp. counterclockwise) in A from $v_1$ before
this time. The nodecover of algorithm A is the part of the ring contained between
the left and the right boundary nodes where all nodes have the page.
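The cost model can be made concrete with a small sketch. It assumes nodes are numbered 0, ..., n − 1 clockwise (so $v_1$ is node 0) and that a replication travels along one chosen direction of the ring; the function names are illustrative, not taken from the paper.

```python
def ring_distance(n: int, a: int, b: int) -> int:
    """Length of the shorter arc between nodes a and b on a uniform n-node ring."""
    gap = (a - b) % n
    return min(gap, n - gap)


def serve_cost(n: int, holders: set, v: int) -> int:
    """Cost of a request at v: distance to the nearest node holding the page."""
    return min(ring_distance(n, v, w) for w in holders)


def replicate(n: int, holders: set, src: int, dst: int, d: int,
              clockwise: bool = True):
    """Replicate from src to dst along one direction. The cost is d times the
    path length, and every node on the path receives a copy at no extra cost."""
    steps = (dst - src) % n if clockwise else (src - dst) % n
    direction = 1 if clockwise else -1
    path = {(src + direction * s) % n for s in range(steps + 1)}
    return d * steps, holders | path


holders = {0}                                   # initially only v1 has the page
print(serve_cost(4, holders, 2))                # request at the opposite node
cost, holders = replicate(4, holders, 0, 2, d=10)
print(cost, sorted(holders))
```

On the 4-node ring, a request at the node opposite $v_1$ costs 2, and replicating to it clockwise costs 2d while also covering the intermediate node.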
3 Lower Bound for the Ring $C_n$

Let $\sigma = \sigma_1\sigma_2 \dots \sigma_m$ be the sequence of requests to be served on-line and let
$\sigma^j = \sigma_1\sigma_2 \dots \sigma_j$ for every $j \le m$. Let $\lambda = 0.567$.

Theorem 1. For large even n, no deterministic on-line algorithm for the ring
$C_n$ is better than $B$-competitive, where

[display defining $B$: illegible in source]

Proof. Suppose, to the contrary, there is some deterministic on-line algorithm A
whose competitive ratio c is less than B. We shall demonstrate that irrespective
of the algorithm's policy its competitive ratio is at least B, which contradicts our
assumption on A. Without loss of generality we normalize the ring's circumference
to be equal to 2. Let w be the node opposing $v_1$ in the ring, i.e., $l(v_1, w) = 1$. The
adversary's requesting strategy is as follows. First, the adversary issues requests
at node w until algorithm A replicates the page to w. If at that moment all
nodes of the ring are in the nodecover of A, then the adversary ends the request
sequence. Otherwise, the adversary issues more requests inside the uncovered
part of the ring. The sequence ends when algorithm A covers all nodes in the
ring. Moreover, the adversary constantly watches how A serves every request. If,
at some request, the ratio of A's cost to OPT's cost reaches bound B, the sequence
is finished immediately. If this does not occur, then eventually the adversary forces
A to cover all nodes in the ring, whereby A incurs the cost of serving all requests
plus the cost of replicating the page around the ring. On the other hand, the adversary's
cost is at most that of replicating the page over one semiring.

Let us consider costs incurred by the adversary and the algorithm on input $\sigma$.
Let $\sigma^k = \sigma_1\sigma_2 \dots \sigma_k$ be the part of the request sequence issued at node w which
precedes replication of A to node w. Knowing the whole sequence, the adversary
may always choose to replicate the page prior to serving any requests. Thus,
similarly as in the 2-node graph case [3], the adversary's cost on $\sigma^k$ is

$$C_{OPT}(\sigma^k) = \min(k, d) \cdot l(v_1, w) = \min(k, d). \qquad (1)$$

In order to bound the cost incurred by A while serving $\sigma^k$ we need to
introduce some additional notation. By $\beta_j$, $0 \le \beta_j \le 1$, we denote the part of the ring's
circumference contained in the nodecover of A after requests $\sigma^j$ were serviced.
Furthermore, let $\alpha_j$ (resp. $1 - \alpha_j$) be the part of A's nodecover $2\beta_j$ "grown"
clockwise (resp. counterclockwise) from $v_1$. By symmetry of the ring we may
assume, without loss of generality, that the algorithm serves $\sigma^k$ clockwise, which
implies that $\alpha_j \ge 1 - \alpha_j$ holds, i.e., $\frac{1}{2} \le \alpha_j \le 1$. Now, we claim that for each
$l$, $1 \le l \le k$,

[display (2), a lower bound on $C_A(\sigma^l)$: illegible in source]

We prove (2) by induction on the number of requests $l$. For $l = 1$ both sides of
(2) equal $1 + 2\beta_1 d$, where 1 is the cost incurred by A serving request $\sigma_1$ over the
distance of $l(v_1, w) = 1$ and $2\beta_1 d$ is the cost of replication over part $\beta_1$ of the
ring incurred by A after serving $\sigma_1$. Assume that (2) holds for $l - 1 < k$. The
adversary's cost is $C_{OPT}(\sigma^{l-1}) = \min(l - 1, d)$. In addition, by the assumption
on A the following constraint

$$C_A(\sigma^{l-1}) \le c \cdot C_{OPT}(\sigma^{l-1}) \qquad (3)$$

holds. The adversary issues request $\sigma_l$ which is serviced by A from its closest
boundary node at a cost of $1 - 2\alpha_{l-1}\beta_{l-1}$. Thus, by (1), (2), and (3)

[display (4), a lower bound on $1 - 2\alpha_{l-1}\beta_{l-1}$: illegible in source]

Additionally, after serving request $\sigma_l$ algorithm A may replicate the page,
extending its nodecover from $2\beta_{l-1}$ to $2\beta_l$ (if it does not, then $\beta_{l-1} = \beta_l$). Hence,
by (4) and the inductive hypothesis, the total cost of A on $\sigma^l$ is

$$C_A(\sigma^l) = C_A(\sigma^{l-1}) + (1 - 2\alpha_{l-1}\beta_{l-1}) + 2d(\beta_l - \beta_{l-1})$$

[remaining lines of this display, which bound the right-hand side from below and yield (2): illegible in source]

After A has served $\sigma^k$ its nodecover reaches node w. At that moment $2\alpha_k\beta_k =
l(v_1, w) = 1$, so the total length of A's nodecover is $2\beta_k = 1/\alpha_k$. As before, the
constraint

$$C_A(\sigma^k) \le c \cdot C_{OPT}(\sigma^k) \qquad (6)$$

must hold. Now, we use (1) and (2) to rewrite (6) and obtain a lower bound for
the competitive ratio c on $\sigma^k$

[display (7): illegible in source]

Note that for each $k \ge d$ OPT's cost equals $d$ whereas A's cost
$C_A(\sigma^k)$ is increasing in $k$, so the competitive ratio may only grow for such $k$'s.
Thus, we may assume without loss of generality that A chooses $k \le d$. Moreover,
A chooses each $\alpha_i$ so that its cost is the smallest possible. For given $k$ cost
(2) decreases if each $\alpha_i$ is replaced by $\alpha = \max_j(\alpha_j)$. By setting $l = k$, (2) can be
rewritten as

[display (8): illegible in source]

and (7) simplifies to

$$c \ge [\ldots] = c_1(\alpha, k). \qquad (9)$$

If A chooses $\alpha = \frac{1}{2}$ then the whole ring is already covered so the adversary ends
the request sequence. In this case c is at least $\min_{k \le d} c_1(\frac{1}{2}, k) = c_1(\frac{1}{2}, d) \approx
2.5415 > B$, a contradiction. In the case when algorithm A chooses $\alpha > \frac{1}{2}$
the adversary continues the request sequence and forces A to replicate over the
uncovered part $2 - \frac{1}{\alpha}$ of the ring. All requests are placed at the midpoints of the
currently uncovered part of the ring. Ultimately the adversary's cost may increase
to at most d because the adversary may replicate the page to cover all requested
nodes in the semiring which was not covered by A before replicating to node w.
On the other hand, the algorithm continues to pay for serving requests until it
has all nodes in its nodecover. This must eventually happen because otherwise
A's cost grows infinitely and OPT's cost remains fixed, which does not guarantee
any constant competitive ratio. In summary, the total costs on $\sigma$ must
satisfy

$$C_{OPT}(\sigma) \le k + (d - k) = d, \qquad (10)$$

$$C_A(\sigma) = C_A(\sigma^k) + C_A(\sigma \setminus \sigma^k) \ge C_A(\sigma^k) + d\left(2 - \frac{1}{\alpha}\right), \qquad (11)$$

$$C_A(\sigma) \le c \cdot C_{OPT}(\sigma). \qquad (12)$$

Now, we may use (10), (11) to rewrite (12) and obtain a lower bound for the
competitive ratio on the whole request sequence $\sigma$, which can be rewritten as

$$c \ge [\ldots] = c_2(\alpha, k). \qquad (13)$$



Algorithm A chooses parameters $\alpha$ and $k$ to minimize the competitive ratio
$\max_{\alpha,k}\{c_1(\alpha, k), c_2(\alpha, k)\}$. In order to balance the two expressions $c_1(\alpha, k)$ and
$c_2(\alpha, k)$ we set $\alpha = 1$, $k = \lambda d$. We get $c \ge c_1(1, \lambda d) = c_2(1, \lambda d) = B$, a
contradiction. □

Corollary 1. For large d, the lower bound B approaches 2.31023.



4 Lower and Upper Bounds for the Ring $C_4$

The strongly competitive deterministic algorithms for trees and complete graphs 
[3] are conservative in the sense they replicate the page across an edge only af- 
ter the accumulated cost of serving prior requests along this edge equals page 
size d. This strategy results in the competitive ratio of 2. Using the same ap- 
proach for rings does not seem to work well enough: it is rather easy to come 
up with a request sequence on which no algorithm of this type can be better 
than 2.5-competitive on C4 and 4-competitive on Cn for large n. The adversary 
issues requests at the midpoint of that part of the ring which the algorithm has 
not covered yet. The competitive ratio increases from 2 because the ring is a 
biconnected graph in that there are always two paths between a requesting node 
which does not have the page and a node which has it. An on-line algorithm has 
to decide which path should be used for servicing requests and which for
replicating the page. This makes the replication problem for rings more difficult than
for trees or complete graphs where no such uncertainty exists.

Algorithm Mirror [5] follows a more aggressive replication policy: it repli- 
cates the page in both semirings after the accumulated cost of serving prior 
requests along some pair of edges, one in each semiring, equals page size d. This 
technique brings down the competitive ratio on any uniform ring from 4 to 3. 
Still, it does not come close to the conjectured lower bound of Black and Sleator
on $C_4$. Our idea for beating the bound of 2.5 is to take yet another approach to
replication in which the page may be copied to a node that issued fewer than d
requests provided there are "enough" requests in its neighborhood. We present
an algorithm which is 2.36603-competitive for large values of d. Let
$k < d$ be some integer to be decided later.

Algorithm R4. For each node $v_i$, $1 < i \le 4$, in $C_4$ maintain a count $n_i$. Initially
all counts are set to zero. If there is a request at node $v_i$ that does not have the
page then serve it from the closest boundary node and increase the count $n_i$ by
one. Furthermore,

(1a) if $n_2 + n_3 = k$ and $n_2 + n_4 > 0$ then replicate the page to $v_2$ from $v_1$;

(1b) if $n_4 + n_3 = k$ and $n_2 + n_4 > 0$ then replicate the page to $v_4$ from $v_1$;

(2) if $n_3 = k$ and $n_2 + n_4 = 0$ then replicate the page to $v_2$ and $v_4$ from $v_1$;

(3) if $n_3 = d$ then replicate the page to $v_3$ from the closest boundary node.

Theorem 2. For large d, the competitive ratio of algorithm R4 on $C_4$
approaches $(3 + \sqrt{3})/2$.

Before we prove Theorem 2 we make two simple observations concerning
algorithm R4 and the optimal off-line algorithm OPT for $C_4$.

Lemma 1. For each node $v_i$, $1 < i \le 4$, define $r_i = |\{\sigma_j, 1 \le j \le m : \sigma_j$
requested at $v_i\}|$. Then for every request sequence $\sigma$

$$C_{OPT}(\sigma) = \min\{\min(r_2 + r_3, d) + \min(r_4, d),\ \min(r_4 + r_3, d) + \min(r_2, d)\} + \min(r_3, d).$$

Proof. We split the cost incurred by OPT into parts corresponding to the two
semirings $(v_2, v_4)$ and $(v_4, v_2)$. An edge (arc) incurs for a request a cost of its length if
the path from the requested node to the closest boundary node serving it passes
through this edge (arc), otherwise the incurred cost is zero. An edge (arc) also incurs
a cost of replication across it. The cost in semiring $(v_2, v_4)$ depends only on
requests at node $v_3$. The optimal off-line strategy is to replicate the page to node $v_3$
before responding to any requests if the number of requests $r_3$ is at least d,
otherwise to serve all requests from the closest boundary node. Hence, the incurred
cost is $\min(r_3, d)$. The cost in semiring $(v_4, v_2)$ depends on requests at nodes $v_2$
and $v_4$ but also on the direction of serving requests at node $v_3$. Algorithm OPT
may choose to serve requests at $v_3$ clockwise via node $v_4$ or counterclockwise
via node $v_2$. In the first case the OPT cost is $\min(r_4 + r_3, d) + \min(r_2, d)$, in the
second case the cost is $\min(r_2 + r_3, d) + \min(r_4, d)$. Hence, the optimal off-line
cost on the semiring $(v_4, v_2)$ is the smaller of the above two sums. □
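The closed form of Lemma 1 is easy to evaluate directly. The following sketch transcribes it, under the assumption that the formula reads as min{min(r2 + r3, d) + min(r4, d), min(r4 + r3, d) + min(r2, d)} + min(r3, d); `opt_cost_c4` is a hypothetical name.

```python
def opt_cost_c4(r2: int, r3: int, r4: int, d: int) -> int:
    """Off-line optimal cost on the 4-node ring, given the number of requests
    r2, r3, r4 at nodes v2, v3, v4 and the page size factor d."""
    # Semiring through v1: v3's requests travel via v2 or via v4,
    # whichever is cheaper overall.
    via_v2 = min(r2 + r3, d) + min(r4, d)
    via_v4 = min(r4 + r3, d) + min(r2, d)
    # Opposite semiring: only v3's requests matter; serve them or replicate.
    return min(via_v2, via_v4) + min(r3, d)


d = 10
print(opt_cost_c4(0, 7, 0, d))    # fewer than d requests at v3: serve each at cost 2
print(opt_cost_c4(0, 10, 0, d))   # d requests at v3: replicate over two edges
print(opt_cost_c4(3, 4, 2, d))
```

With requests only at $v_3$, the cost is 2·r3 while r3 < d and 2d once r3 reaches d, which mirrors the ski-rental behaviour mentioned in the introduction.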

Lemma 2. For every input $\sigma$ in $C_4$, (nodecover OPT) $\subseteq$ (nodecover R4).

Proof. The claim is obvious if nodecover R4 = $\{v_1, v_2, v_3, v_4\}$. Next, consider the
case when nodecover R4 = $\{v_1, v_2, v_4\}$. From the definition of R4 it follows that
$r_3 < d$ and by Lemma 1 we know that in such a case OPT does not replicate the
page to node $v_3$ either. Another case to consider is when nodecover R4 = $\{v_1, v_2\}$
(a similar analysis deals with $\{v_1, v_4\}$). From the definition of R4 it follows that
$r_4 + r_3 < k$, so $r_3 < k$, $r_4 < k$. By Lemma 1 we know that in such a case OPT does
not replicate the page to nodes $v_3$ and $v_4$ either. Finally, consider the case when
nodecover R4 = $\{v_1\}$. Then $r_2 + r_3 < k$ and $r_4 + r_3 < k$, so $r_2 < k$, $r_3 < k$, $r_4 < k$.
Again, by Lemma 1 we know that OPT does not do any replications either, so
the claim of the lemma follows. □

Proof. (Theorem 2). Lemma 2 implies that once algorithm R4 has completed
the last replication, its cost on any subsequent requests is at most the cost of
OPT on these requests. Hence, beginning with the first request after R4's last
replication, the ratio of total costs of R4 and OPT on the whole sequence $\sigma$ may
only decrease. Therefore, we may assume without loss of generality that the
adversary finishes $\sigma$ after the last replication of R4. Note that after serving the
last request count $n_i$ equals $r_i$ for each i. Further analysis depends on the rule
which algorithm R4 uses for the last replication.

Case 1. Rule 1a or 1b. Without loss of generality we analyze the case for Rule 1a.
We need to consider two subcases.

Case 1.1. Rule 1a was fired when nodecover R4 = $\{v_1\}$. At that moment $n_2 +
n_3 = k$ and $n_4 + n_3 < k$. Hence $C_{R4}(\sigma) = (n_2 + n_3 + d) + n_3 + n_4 = k + d + n_3 + n_4$
and $C_{OPT}(\sigma) = 2n_3 + n_2 + n_4 = k + n_3 + n_4$, so the cost ratio is $1 + \frac{d}{k + n_3 + n_4}$.
The adversary chooses $n_3 = n_4 = 0$ to maximize ratio c. Thus $c \le 1 + \frac{d}{k}$.

Case 1.2. Rule 1a was fired when nodecover R4 = $\{v_1, v_4\}$, i.e., it follows
execution of Rule 1b. Let $\sigma'$ be the part of the request sequence $\sigma$ processed by
R4 immediately before it executed Rule 1b and let $n'_2, n'_3, n'_4$ be the numbers of
requests issued at nodes $v_2, v_3, v_4$ by that time. We have $n'_4 + n'_3 = k$, $n'_2 + n'_3 <
k$, $C_{R4}(\sigma') = (n'_4 + n'_3 + d) + n'_3 + n'_2 = k + d + n'_3 + n'_2$ and $C_{OPT}(\sigma') =
2n'_3 + n'_2 + n'_4 = k + n'_3 + n'_2$. Also, at the moment when R4 fired Rule 1a we have
$n_2 + n_3 = k$. Thus, we have $C_{R4}(\sigma) = (n'_4 + n'_3 + d) + n_3 + (n_2 + d) = 2k + 2d$
and $C_{OPT}(\sigma) = 2n_3 + n_2 + n'_4 = k + n_3 + n'_4 = 2k + (n_3 - n'_3)$. The cost ratio
is $\frac{2k + 2d}{2k + (n_3 - n'_3)}$. The adversary chooses $n_3 = n'_3$ to maximize this ratio. Therefore
$c \le 1 + \frac{d}{k}$.

Case 2. Rule 2. From the definition of R4 it follows that Rule 2 is fired only if
$n_3 = k$ and $n_2 = n_4 = 0$, so that at the moment of this replication nodecover
R4 = $\{v_1\}$. We have $C_{R4}(\sigma) = 2k + 2d$, $C_{OPT}(\sigma) = 2k$. Hence $c \le 1 + \frac{d}{k}$.

Case 3. Rule 3. From the definition of R4 it follows that if Rule 3 is fired then
either Rule 2 or both Rules 1a and 1b must have been fired before that time.
Hence, if R4 replicates the page to node $v_3$ then it has already replicated to nodes
$v_2$ and $v_4$ and nodecover R4 = $\{v_1, v_2, v_4\}$. We need to consider two subcases.

Case 3.1. Rule 3 follows Rule 2. In that case R4's cost increases from $2k + 2d$
to $C_{R4}(\sigma) = (k + d) + 2d + d = k + 4d$, whereas OPT's cost increases from $2k$
to $C_{OPT}(\sigma) = 2d$. Hence $c \le 2 + \frac{k}{2d}$.

Case 3.2. Rule 3 follows Rules 1a and 1b. In that case R4's cost increases from
$2k + 2d$ to $C_{R4}(\sigma) = (k + d) + 2d + (n_2 + d) = k + n_2 + 4d$, whereas OPT's cost
increases from $2k + (n_3 - n'_3)$ to $C_{OPT}(\sigma) = n_2 + 2d$. Hence, the cost ratio is
$2 + \frac{k - n_2}{2d + n_2}$. The adversary chooses $n_2 = 0$ to maximize ratio c. Therefore we again have
$c \le 2 + \frac{k}{2d}$.

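Algorithm R4 chooses k which minimizes the worst case cost ratio
for all possible request sequences $\sigma$. By a balancing argument the choice of
$k = \lfloor(\sqrt{3} - 1)d\rfloor$ guarantees

$$c \le \max\left\{1 + \frac{d}{\lfloor(\sqrt{3} - 1)d\rfloor},\ 2 + \frac{\lfloor(\sqrt{3} - 1)d\rfloor}{2d}\right\} \qquad (15)$$

for all possible inputs $\sigma$. As d tends to infinity the competitive ratio of algorithm
R4 approaches $1 + \frac{1}{\sqrt{3} - 1} = \frac{3 + \sqrt{3}}{2}$. □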
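The balancing step can be checked numerically. The sketch assumes that the two worst-case ratios from the case analysis are 1 + d/k (Cases 1 and 2) and 2 + k/(2d) (Case 3), evaluated at k = ⌊(√3 − 1)d⌋; the function name `r4_bound` is illustrative.

```python
import math

LIMIT = (3 + math.sqrt(3)) / 2      # about 2.36603


def r4_bound(d: int) -> float:
    """Larger of the two case bounds for k = floor((sqrt(3) - 1) * d).
    Requires d >= 2 so that k >= 1."""
    k = math.floor((math.sqrt(3) - 1) * d)
    return max(1 + d / k, 2 + k / (2 * d))


for d in (10, 100, 10_000, 1_000_000):
    print(d, r4_bound(d), r4_bound(d) - LIMIT)
```

Setting the two ratios equal gives k² + 2dk − 2d² = 0, whose positive root is k = (√3 − 1)d; rounding k down only increases the first ratio, so the computed bound lies slightly above the limit and converges to it as d grows.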

Now we demonstrate that the competitive ratio attained by algorithm R4
cannot be improved.

Theorem 3. No deterministic on-line algorithm on $C_4$ can be better than
$\frac{3 + \sqrt{3}}{2}$-competitive.

Proof. Let A be some deterministic on-line algorithm for rings. The adversary's
strategy consists of at most three phases. In the first phase the adversary issues
requests at node $v_3$ until algorithm A replicates the page for the first time. Let k be
the number of requests which precede the first replication of A. Without loss of
generality we may assume that k is finite because for an infinite k the cost of
A grows to infinity, which does not guarantee A any constant competitive ratio.
The next steps of the adversary depend on the value of k and the number of
nodes which enter the nodecover of A at the moment of A's first replication.

Case 1. All three nodes $v_2, v_3, v_4$ enter the nodecover of A. The adversary ends
the request sequence. We have $C_A(\sigma) = 2k + 3d$ and $C_{OPT}(\sigma) = 2 \cdot \min(k, d)$.
For every $k > 0$ this yields

$$c \ge \frac{2k + 3d}{2 \cdot \min(k, d)} \ge 2.5. \qquad (16)$$

Case 2. Two nodes $v_2, v_3$ or $v_3, v_4$ enter the nodecover of A. Depending on k, the
adversary can either stop or continue the request sequence. In the case it stops,
$C_A(\sigma) = 2k + 2d$ and $C_{OPT}(\sigma) = 2 \cdot \min(k, d)$. We get

$$c \ge \frac{2k + 2d}{2 \cdot \min(k, d)}. \qquad (17)$$

Otherwise, the adversary begins the second phase: it issues requests at the
midpoint of the uncovered part of the ring until A replicates the page to this node.
Let $l$ be the number of requests issued in the second phase (again, we may
assume without loss of generality that $l$ is finite). We have $C_A(\sigma) = 2k + l + 3d$
and $C_{OPT}(\sigma) = \min(k, d) + \min(k + l, d)$. Thus, we obtain

$$c \ge \frac{2k + l + 3d}{\min(k, d) + \min(k + l, d)}. \qquad (18)$$

Algorithm A chooses $k$ and $l$ to balance scenarios (17) and (18) over all request
sequences $\sigma$. This yields

$$c \ge \min_{k,l} \max\left\{\frac{2k + 2d}{2 \cdot \min(k, d)},\ \frac{2k + l + 3d}{\min(k, d) + \min(k + l, d)}\right\}.$$

We find by calculation that the best ratio c algorithm A can hope for is at
$k = d$, $l = 0$, whereby the competitive ratio is $c \ge 2.5$.

Case 3. Two nodes $v_2, v_4$ enter the nodecover of A. Again, depending on k, the
adversary can either stop or continue the request sequence. In the case it stops
we have $C_A(\sigma) = 2k + 2d$ and $C_{OPT}(\sigma) = 2 \cdot \min(k, d)$. We get

$$c \ge \frac{2k + 2d}{2 \cdot \min(k, d)}. \qquad (19)$$

Otherwise, the adversary begins the second phase and continues requesting at
$v_3$ until A replicates the page to $v_3$. Let $j$ be the number of requests issued in
the second phase (for the same reasons as before we may assume that $j$ is finite).
We have $C_A(\sigma) = 2k + j + 3d$ and $C_{OPT}(\sigma) = 2 \cdot \min(k + j, d)$. Thus, we obtain

$$c \ge \frac{2k + j + 3d}{2 \cdot \min(k + j, d)}. \qquad (20)$$

Algorithm A chooses $k$ and $j$ to balance scenarios (19) and (20) over all request
sequences $\sigma$. This yields

$$c \ge \min_{k,j} \max\left\{\frac{2k + 2d}{2 \cdot \min(k, d)},\ \frac{2k + j + 3d}{2 \cdot \min(k + j, d)}\right\}.$$

We find by calculation that the best ratio c algorithm A can hope for is at
$k = \lfloor d(\sqrt{3} - 1)\rfloor$, $j = d - k$, whereby the competitive ratio is $c \ge 1 + \frac{d}{\lfloor d(\sqrt{3} - 1)\rfloor}$.
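The "calculation" in Case 3 can be reproduced by brute force for a fixed page size. The sketch assumes the two scenario ratios are (2k + 2d)/(2·min(k, d)) when the adversary stops and (2k + j + 3d)/(2·min(k + j, d)) when it continues; the function names and the search range 1 ≤ k ≤ 2d, 0 ≤ j ≤ 2d are choices made for illustration.

```python
import math

LIMIT = (3 + math.sqrt(3)) / 2


def case3_value(d: int, k: int, j: int) -> float:
    """Ratio the adversary can force in Case 3 for the algorithm's choice (k, j)."""
    stop = (2 * k + 2 * d) / (2 * min(k, d))
    go_on = (2 * k + j + 3 * d) / (2 * min(k + j, d))
    return max(stop, go_on)


def best_case3(d: int):
    """Choice (k, j) minimizing the forced ratio; returns (ratio, k, j)."""
    return min(
        (case3_value(d, k, j), k, j)
        for k in range(1, 2 * d + 1)
        for j in range(0, 2 * d + 1)
    )


ratio, k, j = best_case3(100)
print(ratio, k, j, math.floor(100 * (math.sqrt(3) - 1)))
```

For d = 100 the search selects k = 73 = ⌊d(√3 − 1)⌋ and j = d − k = 27, with a ratio of 1 + 100/73, slightly above (3 + √3)/2.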

Case 4. One node $v_2$ or $v_4$ enters the nodecover of A. If the adversary stops after
the first phase then we have $C_A(\sigma) = 2k + d$ and $C_{\mathrm{OPT}}(\sigma) = 2 \cdot \min(k, d)$. Thus,
we get

$$c > \frac{2k + d}{2 \cdot \min(k, d)}. \qquad (21)$$



Otherwise, the adversary begins the second phase and issues requests at the
uncovered node of the pair $\{v_2, v_4\}$. Let $l$ be the number of requests issued
in the second phase (we may assume that $l$ is finite). If the adversary stops
after the second phase then we have $C_A(\sigma) = 2k + l + 2d$ and
$C_{\mathrm{OPT}}(\sigma) = \min(k, d) + \min(k + l, d)$. Thus, we obtain

$$c > \frac{2k + l + 2d}{\min(k, d) + \min(k + l, d)}. \qquad (22)$$



Otherwise, it begins the third phase in that it returns to $v_3$ and issues requests
until A replicates the page to $v_3$. Let $j$ be the number of requests issued in the
third phase (we may assume that $j$ is finite). We have $C_A(\sigma) = 2k + l + j + 3d$
and $C_{\mathrm{OPT}}(\sigma) = \min(k + j, d) + \min(k + l + j, d)$. Hence, we get

$$c > \frac{2k + l + j + 3d}{\min(k + j, d) + \min(k + l + j, d)}. \qquad (23)$$



Algorithm A chooses $j$, $k$ and $l$ to balance scenarios (21), (22) and (23) over all
request sequences $\sigma$, whereas the adversary chooses the phase after which to stop
the request sequence so that for given $j, k, l$ the competitive ratio

$$c > \min_{j,k,l} \max \left\{ \frac{2k + d}{2 \cdot \min(k, d)},\; \frac{2k + l + 2d}{\min(k, d) + \min(k + l, d)},\; \frac{2k + l + j + 3d}{\min(k + j, d) + \min(k + l + j, d)} \right\}$$

is worst possible. We find by calculation that the best ratio $c$ algorithm A can
hope for is at $k = \lfloor d(\sqrt{3} - 1) \rfloor$, $j = d - k$, $l = 0$ (note that setting $l = 0$ implies
that Case 4 reduces to Case 3). For this choice of parameters the competitive
ratio is $c > 1 + d / \lfloor d(\sqrt{3} - 1) \rfloor$. For large $d$, the bound approaches
$(3 + \sqrt{3})/2 \approx 2.36603$.
5 Conclusions

In this paper we obtained two new deterministic lower bounds and one new
upper bound for the replication problem in uniform rings. The lower bound of
2.36603 for C_4 matches the upper bound attained by algorithm R4. In contrast,
the gap between the lower bound of 2.31023 for large rings and the best known
upper bounds leaves ample space for further research. We beat the lower bound
of 2.5 in C_4 by the use of an aggressive replication policy which copies the page
in anticipation of future requests. It would be interesting to extend this concept
to all ring sizes. Finally, we note that our upper bound for C_4 also holds in the
randomized case for which the first lower bound of 1.75037 has been recently
reported [4].
Abstract. Security of mobile code is a major issue in today’s global
computing environment. When you download a program from an un- 
trusted source, how can you be sure it will not do something undesirable? 
In this paper I will discuss a particular approach to this problem called 
language-based security. In this approach, security information is derived 
from a program written in a high-level language during the compilation 
process and is included in the compiled object. This extra security
information can take the form of a formal proof, a type annotation, or
some other form of certificate or annotation. It can be downloaded along 
with the object code and automatically verified before running the code 
locally, giving some assurance against certain types of failure or unau- 
thorized activity. The verifier must be trusted, but the compiler, code, 
and certificate need not be. Java bytecode verification is an example of 
this approach. I will give an overview of some recent work in this area,
including a particular effort in which we are trying to make the production
of certificates and the verification as efficient and invisible as possible. 



1 Introduction 

With the rise of the Internet, security of mobile code is emerging as one of the 
most important challenges facing computing research today. As we become more 
and more dependent on the global information infrastructure, we are finding
ourselves increasingly vulnerable to malicious attacks and buggy software. Yet,
even as Melissa and Happy99 wreak worldwide havoc, we continue to download
and run plug-in software with little regard for the consequences.

A recent study of the Computer Science and Telecommunications Board of 
the National Research Council [30] details the extent of the security problem. It
argues that much of our critical infrastructure (transportation, communication,
financial markets, energy distribution, health care) is becoming dangerously
dependent on a computing base that is out of the purview of any single authority.
We are already vulnerable to many forms of malicious attack and software failure
with potentially devastating consequences. 

The recent report of the President’s Information Technology Advisory Com- 
mittee (PITAC) [10] warns of this dependence and recommends a substantial 
increase in federally funded research: 
We have become dangerously dependent on large software systems whose
behavior is not well understood and which often fail in unpredicted ways
. . . Our nation’s dependence on the Internet is increasing. While this is 
an exciting development, the Internet is growing well beyond the intent 
of its original designers and our ability to extend its use has created
enormous challenges. As the size, capability, and complexity of the Internet
grows, it is imperative that we do the necessary research to learn how 
to build and use large, complex, highly-reliable, and secure systems . . .

This research will . . . protect us from catastrophic failures of the
complex systems that now underpin our transportation, defense, business,
finance, and healthcare infrastructures. [10] 

President Clinton and Vice President Gore responded to an interim version 
of this report with a far-reaching initiative known as IT², in which they propose
a $366 million, 28% increase for research in information technology as part of 
the fiscal 2000 federal budget [20]. In their words: 

As our economy and society become increasingly dependent on informa- 
tion technology, we must be able to design information systems that are 
more secure, reliable, and dependable. The software systems that lie at
the core of worldwide financial systems, air traffic management, defense 
command and control — indeed, virtually all parts of our economy — are 
the most complex human inventions ever created. As a result, however, 
our society now faces unknown hazards both from hostile attacks on 
these systems and from the even greater threat that simple mistakes 
or system failures will bring wholesale collapse of critical systems. The 
small software failures that have shut down large parts of the nation’s
phone systems and air traffic control systems and the “millennium bug”
are examples of what can go wrong in our current environment. We do
not know how to design and test complex software systems with millions 
of lines of code in the same way that we can verify whether a bridge or 
an airplane is safe. [20, p. 5] 

And from elsewhere in the same report: 

Active software participates in its own development and deployment.

We see the first steps towards active software with “applets” that can 
be downloaded from the Internet, but this is just the beginning. Active 
software will eventually be able to update itself, monitor its progress 
toward a particular goal, discover a new capability that is needed for 
the task at hand, and safely and securely download the piece of software
needed to perform that task. [20, p. 7] 

In this paper we focus on a particular approach to the security problem known 
as language-based security. We give a general overview, then we discuss several 
current research projects that fall within this framework: proof-carrying code
(PCC) [21-26], typed assembly language (TAL) [7, 13, 14, 16], security automata
(SASI) [6,28,29], efficient code certification (ECC) [11], and information flow
(JFlow) [17-19]. 
2 Some Issues in Security

2.1 Safety Policies 

Suppose we wish to download and run a program from an unknown or untrusted 
source. Before running our downloaded program, it would be nice to have some 
assurance that the code is safe to run. Of course, “safe” is subject to interpre- 
tation and may have different meanings in different contexts. The definition of 
“safe” used in a particular application is that application’s safety policy. For 
example, we may wish to be sure that the program will never accidentally over- 
write critical system data, thereby causing a crash; this would be desirable in, 
say, flight control software or active messages in network routers. We may wish
to know that the program does not access memory allocated to other processes
running in the same physical address space; this would be an important
consideration in smart cards. We may wish to deny all disk I/O; this is currently
part of the default safety policy for Java applets downloaded off the net. This
restriction is rather strong, and we may wish to weaken it to allow restricted 
forms of disk I/O. For example, we may wish to allow the applet to read disk 
files provided it does not send any messages out on the net afterwards, or we 
may wish to let it deposit a limited amount of data of a certain form (cookies) 
in a particular directory. 

At the very minimum, any safety policy for untrusted machine code executing
locally should guarantee the following fundamental safety properties.

- Control flow safety. The program should never execute a jump or call to a 
random location, but only addresses within its own code segment containing 
valid instructions. All calls should be to valid function entry points and all 
returns to the location from which the function was called. 

- Memory safety. The program should not access random places in memory,
but only valid locations in its own static data segment, live system heap
memory explicitly allocated to it, and valid stack frames.

- Stack safety. For stack-based runtime architectures, the runtime stack should
be preserved across function calls. We interpret this flexibly; minor modifi- 
cations near the top of the stack are allowed, as is tail recursion elimination. 

These three considerations turn out to be interdependent. The level of security
they mutually represent is evidently the minimum nontrivial level of safety one
could expect in the sense that it is hard to imagine a meaningful security policy
that would be enforceable without them. More complicated policies are certainly 
possible depending on the application at hand. 

Many papers consider type safety as well; however, this makes sense only in the
presence of a typing discipline. A typing discipline is a way of assigning intention
to raw data and code. The type of a function usually gives a relationship between
the input and output states in terms of this intention. For example, we might
have a function of type $r_1 : \mathrm{int} \times r_2 : \mathrm{int} \to r_3 : \mathrm{int}$, which says that if registers
$r_1$ and $r_2$ contain integer values when the code is called, then upon return,
register $r_3$ must contain a valid integer. In the presence of types, control flow
safety, memory safety, and stack safety are subsumed by type safety, since these
properties are encoded in the typing discipline. This is the approach of TAL (see
Section 4.3 below).
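To make the idea of a register-level function type concrete, here is a minimal sketch of a checker for straight-line code over a three-instruction toy assembly language. It is not the TAL type system: the instruction set (`movi`, `add`, `load`), the type tags `"int"` and `"ptr_int"`, and the function name `check_block` are all hypothetical and chosen only for illustration.

```python
def check_block(code, pre, post):
    """Check that straight-line `code` takes register typing `pre` to `post`.

    `pre` and `post` map register names to type tags ("int" or "ptr_int").
    Returns True if every instruction is well typed and `post` holds at the end.
    """
    regs = dict(pre)  # current register typing, starting from the precondition
    for instr in code:
        op = instr[0]
        if op == "movi":        # movi dst, imm: dst now holds an int
            _, dst, _imm = instr
            regs[dst] = "int"
        elif op == "add":       # add dst, a, b: both operands must be ints
            _, dst, a, b = instr
            if regs.get(a) != "int" or regs.get(b) != "int":
                return False
            regs[dst] = "int"
        elif op == "load":      # load dst, src: src must point to an int
            _, dst, src = instr
            if regs.get(src) != "ptr_int":
                return False
            regs[dst] = "int"
        else:
            return False        # unknown instruction: reject
    # The postcondition must be satisfied by the final register typing.
    return all(regs.get(reg) == tag for reg, tag in post.items())


# The function type from the text: r1 : int x r2 : int -> r3 : int.
PRE = {"r1": "int", "r2": "int"}
POST = {"r3": "int"}
```

With this signature, `check_block([("add", "r3", "r1", "r2")], PRE, POST)` accepts the block, whereas a block that tries to dereference `r1` as a pointer is rejected, because the precondition only promises that `r1` holds an integer.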

2.2 Trust 

Security is based on the notion of trust. In reality, there may be different degrees
of trust, but for the sake of simplicity we partition the universe into two classes,
those agents and artifacts that are trusted and those that are not, separated by 
an imaginary trust boundary. All trusted software — that is, all software on our 
side of the trust boundary — is called our trusted code base. 

All software security mechanisms depend on some trusted code. We can as- 
sume that the trusted code base includes at least the local operating system 
kernel and some programming language runtime support (provided they have 
not been corrupted; part of the security problem is to prevent this from happen- 
ing). However, it is generally desirable to keep the trusted code base as small as 
possible, simply because the less we need to trust, the less vulnerable we are. 

Control flow safety, memory safety, and stack safety can be guaranteed by
writing the program in a type-safe language and compiling with a trusted com- 
piler. The disadvantage of this solution is that the compiler must be part of 
the trusted code base. Either you must send me your source code so that I can
compile it locally, which forces you to release proprietary source code and forces 
me to spend time compiling it, or I must trust your compiler and the channel by 
which you ship the object code to me. Both are unsatisfactory, the former from 
a performance standpoint and the latter from a security standpoint. Many of 
the approaches described below, including the language-based approach, achieve 
their objectives without assuming that the compiler is part of the trusted code 
base. 



2.3 Performance vs. Safety 

It is of course nice to have both strong safety guarantees and good performance,
but these are often in conflict: the latter prefers to allow, the former to restrict.
In a sense, much of the current research in security is concerned with resolving
the tension between these opposing forces in some acceptable way. Different
approaches to the security problem fall on different points of the spectrum, and
it is perhaps unreasonable to expect a single mechanism to be optimal on both 
counts. This is true also for language-based mechanisms. However, language- 
based approaches can give improvements on both counts, as described in Section 
4 below. 

3 Traditional Approaches 

Traditional approaches to the security problem include kernel as reference
monitor, cryptography, code instrumentation, and trusted compilation. These
mechanisms offer a fixed set of basic safety policies with little flexibility.
Security automata and PCC are more recent developments that provide a general
framework for expressing and enforcing a wide range of security policies. TAL and
ECC share the same goals, but sacrifice expressiveness for efficiency.

Kernel as reference monitor is probably the oldest and most widespread 
security mechanism used in software systems. It refers to the practice of isolating 
operations on critical system components and data in a system kernel. The kernel 
is a privileged body of code that may access these critical components and data 
directly. All other processes may only access them in limited ways using the 
kernel as proxy, communicating their desires by message. This not only prevents 
untrusted code from corrupting the system, but also allows the kernel to monitor 
all access, perform authentication, or enforce other safety policies. 

However, allowing non-kernel processes direct access to critical system com- 
ponents and data can improve performance significantly. With kernel calls, access 
is limited to a few high-level abstract operations provided by the kernel interface; 
but with direct access, more sophisticated algorithms that exploit properties of 
the low-level data structures can be used. Also, kernel calls typically involve 
some overhead for packaging parameters and for saving and restoring registers 
(called a context switch), which can be circumvented with direct access. It is 
therefore desirable to figure out how to allow non-kernel processes more direct
access to critical system components and data without compromising security. 
The SPIN system [2] is one effort in this direction. The SPIN system enforces 
security by using a trusted compiler. 

Cryptography can discourage access to sensitive data during transit across an 
untrusted network and can be used for authentication. Unfortunately, the safety 
of current cryptographic protocols depends on unproven complexity-theoretic
assumptions. Current standards such as the Data Encryption Standard (DES)
can be broken by an agent with sufficient computing power [5]. To make matters
worse, we are not completely free to use it; current policy regarding the
commercial use of strong cryptography is hopelessly entangled in a web of
political and legal complications [5]. Finally, cryptography alone cannot ensure that
downloaded code is safe to run, only that it came from a particular source and 
that it has not been compromised in transit. 

Code instrumentation refers to the process of altering (instrumenting) machine
code so that critical operations can be monitored during execution. This is
done in such a way that (i) the functional behavior of the instrumented code is
the same as the original uninstrumented code, provided the original code would
not have violated the safety policy; and (ii) if the original uninstrumented code 
would have violated the safety policy, then at the time of the violation, the in- 
strumented code either detects the violation and causes the system to intercept 
control and shut down the errant process, or otherwise prevents the transgression 
from having any ill effects on the rest of the system. 

An example of code instrumentation is software fault isolation (SFI) or
sandboxing [32]. In one particularly simple and efficient variant of SFI, the untrusted
code and data are loaded into a block of contiguous memory with addresses
in the range $[c \cdot 2^k,\; c \cdot 2^k + 2^k - 1]$ for some integers $c$ and $k$ (the “sandbox”) and
then linked. Then a pass is made over the code, replacing the higher-order
(wordlength − k) bits of all direct memory access and jump addresses with the
bits of c. For indirect addresses, code is inserted to do this operation at runtime.
This has no effect on instructions that target addresses inside the sandbox, so
a correct program will not be affected. However, addresses outside the sandbox
get mapped to addresses inside the sandbox. The addresses they get mapped to
are random from the point of view of the program, which of course breaks the
program, but the error is confined to the sandbox and cannot compromise the 
rest of the system. 
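The address rewriting described above amounts to a mask and an OR. The following minimal sketch illustrates the arithmetic only; it is not taken from any SFI implementation. The function names and the default 32-bit word length are assumptions made for the example.

```python
def sandbox_address(addr, c, k, word_bits=32):
    """Force `addr` into the sandbox [c * 2**k, c * 2**k + 2**k - 1].

    The k low-order bits of `addr` are kept; the remaining high-order
    (word_bits - k) bits are replaced with the bits of c.
    """
    if not 0 <= c < (1 << (word_bits - k)):
        raise ValueError("c does not fit in the high-order bits")
    low_bits = addr & ((1 << k) - 1)  # offset within the sandbox
    return (c << k) | low_bits        # sandbox base combined with the offset


def in_sandbox(addr, c, k):
    """True if `addr` lies inside the sandbox determined by c and k."""
    base = c << k
    return base <= addr <= base + (1 << k) - 1
```

For example, with `c = 5` and `k = 12` the sandbox is the range `0x5000` to `0x5FFF`. An address already inside that range, such as `0x5123`, is left unchanged, while an outside address such as `0x9ABC` is mapped to `0x5ABC`, which is inside the sandbox but meaningless to the program.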

Schneider [28,29] extends this idea to handle any safety policy that can 
be expressed by a finite-state automaton. For example, one can express the 
condition, “No message is ever sent out on the net after a disk read,” with a 
two-state automaton. These automata are called security automata. The code 
is instrumented so that every instruction that could potentially affect the state 
of the security automaton is preceded by a call to the automaton. Security 
automata give considerable flexibility in the specification of safety policies and 
allow the construction of specialized policies tailored to a consumer’s particular
needs. The main drawback is that some runtime overhead is incurred for the 
runtime calls to simulate the automaton. 
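The policy “no message is sent after a disk read” can be written as a two-state automaton that is consulted before every relevant operation. The sketch below is a toy illustration of that idea, not the SASI implementation; the class names, the event names `"read"` and `"send"`, and the helper `run_instrumented` are hypothetical.

```python
class SecurityViolation(Exception):
    """Raised when an operation would violate the safety policy."""


class NoSendAfterRead:
    """Two-state automaton: either no disk read has occurred yet, or one has."""

    def __init__(self):
        self.has_read = False

    def step(self, event):
        # A disk read moves the automaton to (or keeps it in) the "has read" state.
        if event == "read":
            self.has_read = True
        # A send is only permitted while no disk read has occurred.
        elif event == "send" and self.has_read:
            raise SecurityViolation("send attempted after a disk read")


def run_instrumented(events, automaton):
    """Simulate instrumented code: each operation is preceded by an automaton call."""
    performed = []
    for event in events:
        automaton.step(event)    # inserted check, runs before the operation
        performed.append(event)  # stand-in for actually performing the operation
    return performed
```

A trace such as `["send", "read", "read"]` runs to completion, whereas `["read", "send"]` is stopped by a `SecurityViolation` before the send is performed. The per-operation call to `step` is the runtime overhead mentioned in the text.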

An advantage of code instrumentation is that it can be performed in isolation
by the consumer with no particular assumptions or extra information about the 
code. However, enforcing safety policies for arbitrary code by instrumentation 
alone can be costly. A runtime check is required before every sensitive opera- 
tion, which could contribute substantially to runtime overhead. Some runtime 
checks can be eliminated if program analysis determines that they are
unnecessary, but this is also a costly undertaking and could contribute substantially to
loadtime overhead. Moreover, even the most sophisticated analysis techniques 
are necessarily incomplete, because safety properties are undecidable in general.

There is recent evidence that code instrumentation can be used in conjunction 
with language-based methods to improve performance [6,33]. 

4 Language-Based Security 

Schneider defines language-based security very broadly as “a set of techniques 
based on programming language theory and implementation, including semantics,
types, optimization, and verification, brought to bear on the security
question.” By that definition, SFI and SASI are instances of language-based security.
For the purposes of this paper, however, we would like to focus on a more specific 
model. 

Compilers for high-level programming languages typically accumulate much 
information about a program during the compilation process. This information 
may take the form of type information or other constraints on the values of
variables, structural information, or naming information. This information may be
obtained through parsing or program analysis and may be used to perform
optimizations or to check type correctness. After a successful compilation, compilers
traditionally throw this extra information away, leaving a monolithic sequence
of instructions with no apparent structure or discernable properties. 

However, some of this extra information may have implications regarding the 
safety of the compiled object code. For example, programs written in type-safe 
languages must typecheck successfully before they will compile, and assuming 
that the compiler is correct, any object code compiled from a successfully
typechecked source program should be memory-safe. If a code consumer only had
access to the extra information known to the compiler when the program was 
compiled, it might be easier to determine whether the downloaded object code 
is safe to run. 

We will use the phrase language-based security to refer to the idea of retaining 
this extra information from a program written in a high-level language in the 
object code compiled from it. This extra information — call it a certificate — is 
created at compile time and packaged with the object code. When the code is 
downloaded, the certificate is downloaded along with it. The consumer can then 
run a verifier, which inspects the code and the certificate to verify compliance
with a safety policy. If it passes the test, then the code is safe to run. The
verifier is part of the consumer’s trusted code base; the compiler, the compiled 
code, and the certificate need not be. Figure 1 illustrates a simplified version of 
this framework. 




Fig. 1. Language-Based Security (Simplified View)



The key benefit of this approach is that the onus of ensuring compliance 
with the desired safety policy is shifted from the consumer to the supplier. The
supplier must provide a certificate that gives sufficient information to verify that
the object code meets the security policy. The consumer’s task is thus reduced
from the level of proving to the level of checking, a much simpler matter. 

The certificate can take different forms. With PCC, the certificate is a proof 
in first-order logic of certain verification conditions, and the verification process 
involves checking that the certificate is indeed a valid first-order proof. With 
TAL, the certificate is a type annotation, and the verification process involves 
type checking. With ECC, the certificate is an annotation of the object code 
that indicates the structure and intention of the code along with some basic 
type information. 

What high-level language constructs best translate to useful information in a 
certificate? What security policies can be handled? How can we allow consumers 
to express specialized security policies easily? Can we make certificates concise? 
How efficiently can the supplier construct them and how efficiently can the con- 
sumer verify them? How do we prove that the verification mechanism is correct? 
By now there are a number of related projects that address these questions and 
more. Although the various proposals differ in expressiveness, flexibility, and
efficiency, they all share a common goal: to use extra information generated during
compilation to help make the local execution of untrusted mobile code safe and 
efficient. 



4.1 Java 

Perhaps the first large-scale practical instance of the language-based approach 
was the Java programming language [12]. Java contains a language-based mech- 
anism designed to protect against malicious applets. The Java runtime environ- 
ment contains a bytecode verifier that is supposed to ensure the basic properties 
of memory, control flow, and type safety. There is also a trusted security manager 
that enforces higher-level safety policies such as restricted disk I/O. 

The Java compiler produces platform-independent virtual machine instruc- 
tions or bytecode that can be verified by the consumer before execution. The 
bytecode is then either interpreted by a Java virtual machine (VM) interpreter 
or further compiled down to native code.
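The flavor of verifying bytecode before execution can be conveyed with a toy stack machine. The sketch below is not the Java bytecode verifier, which performs a dataflow analysis over branching code; it only checks straight-line programs for operand-stack underflow, overflow, and operand types. The opcode names (`push_int`, `push_ref`, `add`, `pop`) and the function `verify` are hypothetical.

```python
def verify(code, max_stack):
    """Return True if the straight-line toy program passes all checks.

    The verifier tracks only the types on the operand stack, never the values,
    so it can run before the program is executed.
    """
    stack = []  # abstract operand stack holding type tags
    for instr in code:
        op = instr[0]
        if op == "push_int":
            stack.append("int")
        elif op == "push_ref":
            stack.append("ref")
        elif op == "add":
            # Requires two ints on top of the stack and leaves one int.
            if stack[-2:] != ["int", "int"]:
                return False
            stack.pop()
        elif op == "pop":
            if not stack:
                return False  # underflow
            stack.pop()
        else:
            return False      # unknown opcode
        if len(stack) > max_stack:
            return False      # exceeds the declared operand stack size
    return True
```

A program that pushes two integers and adds them is accepted; one that adds with a single operand, adds a reference to an integer, or pushes more values than `max_stack` allows is rejected without ever being run.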

Early versions of Java contained a number of highly publicized security gaps
[4]. For example, one problem was a subtle defect in the Java type system that
allowed a partially instantiated class loader to be created under the control of
an applet. It was then possible for the applet to use this class loader to load, 
say, a malicious security manager that would permit unlimited disk access. 

According to some authors [4, 13], these problems were ultimately due to a 
lack of an adequate semantic model for Java. Steps to remedy this situation have 
since been taken [1,27]. Nevertheless, despite these initial failings, the basic
approach constituted a significant step forward in practical programming language
security. It not only pointed the way toward a simple and effective means of 
providing a basic level of security, but also helped to galvanize the attention of 
the programming language and verification community on critical security issues 
engendered by the rise of the Internet. 
One disadvantage to the Java system is that the machine-independent
bytecode that is produced by the Java compiler is still quite high-level. After
downloading, it must either be interpreted by a Java VM interpreter or compiled to
native code by a just-in-time (JIT) compiler. Either way, a runtime penalty is
incurred. If the safety certificate represented in the bytecode were mapped down 
to the level of native code by a back-end Java VM compiler, then the same degree 
of safety could be ensured without the runtime penalty, because the back-end 
compilation could be done by the code supplier before downloading. This would 
trade the platform independence of Java VM for the efficiency of native code.

4.2 Proof Carrying Code (PCC) 

Proof carrying code (PCC) [21-26] refers to a methodology for allowing formal 
proofs of general safety properties to be produced and verified before the code 
is run. The safety conditions are expressed in first-order logic augmented with 
symbols for various language and machine constructs. The verification process
involves a proof generation step on the part of the software supplier and a proof
checking step on the part of the software consumer. 

The most general version of PCC is somewhat more complicated than indi- 
cated in Figure 1, involving a two-phase interaction between the supplier and 
the consumer. In the first phase of the PCC protocol, the supplier produces 
a program consisting of annotated object code and sends it to the consumer. 
The annotation consists of loop invariants and function pre- and postconditions. 
These annotations make subsequent phases of the protocol easier. The consumer, 
who has a particular safety policy in mind, generates from the annotated code 
a verification condition, a logical formula that implies that the program satisfies
the safety policy, and sends it back to the supplier. A proof of the verification 
condition constitutes a proof of safety of the program with respect to the con- 
sumer’s safety policy. The supplier then runs a theorem prover, which produces 
a proof of the verification condition, then sends the proof back to the consumer. 
The consumer then runs a proof checker to check that the proof is valid. 

The initial annotation of the code is produced by a certifying compiler. The 
compiler uses information from the program source and program analysis during
compilation to construct loop invariants. This process is mostly automatic,
but sometimes human intervention is required, depending on the complexity of 
the security policy. Also included in the initial annotation are pre- and post- 
conditions of functions. The precondition of a function should provide enough 
information to allow a verification condition to be constructed for that function 
in the next phase. The verification condition implies that the function satisfies 
the security policy and satisfies its postcondition. 

The Touchstone compiler [22] is a certifying compiler for a type-safe sub- 
set of C that implements this phase of PCC. In addition to the object code, it 
provides information sufficient for constructing a verification condition for type 
safety. One of the major strengths of the Touchstone compiler is that it admits 
many common optimizations such as dead code elimination, common
subexpression elimination, copy propagation, instruction scheduling, register
allocation, loop invariant hoisting, redundant code elimination, and the elimination of
array bounds checks. 

In the next phase of the PCC protocol, the consumer produces from the 
code and certificate provided by the code supplier a statement in first-order
logic called the verification condition. This task is performed by the verification
condition generator (VCGen). The consumer’s security policy is defined by the
action of the VCGen component; that is, the verification condition that VCGen
generates is, by definition, the formal statement of the security policy as
instantiated for that particular program. It would be difficult (and, for practical
purposes, unnecessary) to express the security policy formally independent of
any particular program. Part of the definition can be understood in terms of the 
action precondition of each individual operation, which is a formal statement 
of what it means for that action to be safe locally. However, the safety policy 
can be more than just the accumulation of all local action preconditions. The 
verification condition is returned to the software supplier.

The next phase of the protocol involves proving the verification condition. 
At this point the protocol works entirely in the framework of first-order logic, 
independent of the original program or programming language. The software 
supplier constructs a formal proof of the verification condition and returns it to 
the consumer for checking. The verifier checks that the proof is indeed a valid 
proof of the verification condition constructed in the previous phase. 

In the PCC implementation [22], the verification condition and its proof are 
encoded using the Edinburgh Logical Framework (LF) [8]. The theorem prover is
based on the Nelson-Oppen theorem prover architecture for combined theories. 
Key tools are congruence closure for dealing with equality and linear simplex 
for dealing with arithmetic. The latter is important in eliminating array bounds 
checks. 

Necula’s PhD thesis [22] describes extensive experiments with PCC giving 
results on running times and code and proof sizes for various benchmarks. 

The advantages of the PCC approach are its expressiveness and the abil- 
ity to handle code optimizations. In principle, any security policy that can be 
constructed by a verification condition generator and expressed as a first-order 
verification condition can be handled. The main disadvantages are that it is 
a two-phase protocol, that it involves weighty machinery such as a full-fledged
first-order theorem prover and proof checker, and that proof sizes are quite large,
roughly 2.5 times the size of the object code for type safety and even larger sizes 
for more complicated safety policies. This makes PCC appropriate for applica- 
tions requiring fast optimized code that will be verified once but run many times, 
such as extensions to extensible system kernels, but less attractive for run-once 
applications such as applets and active messages. 

4.3 Typed Assembly Language (TAL) 

Typed assembly language (TAL) [7, 13, 14, 16] is a language-based system in
which type information from a strongly typed high-level language is carried
down during the compilation process through a series of transformations, through
a platform-independent typed intermediate language (TIL) [15, 31], and finally
down to the level of the object code itself. The result is a type annotation of
the object code that can be checked by an ordinary type checker. One can view
TAL as a form of proof-carrying code (PCC) in the sense that a complete type
annotation is essentially a proof of type safety. In this view, a type checker is
essentially a proof checker.

TAL is not as expressive as PCC, but it can handle any security policy that
can be expressed in terms of the type system. This includes memory, control
flow, and type safety, among others. TAL is also robust with respect to compiler
optimizations, since type annotations can be transformed along with the code. 

The original version of TAL [16] was rather abstract, compiling down from
a polymorphically typed abstract ML-like language to an idealized RISC-like
assembly language. Function call linkages were encoded using continuation-passing
semantics. This version of TAL already distinguished between initialized and
uninitialized data, so that allocation of memory and its initialization need not
occur atomically. Deallocation of memory is assumed to be handled by a trusted
garbage collector, although some work has been done toward relaxing this
assumption [3]. The syntax of the original version of TAL is given in Figure 2.



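types                   $\tau ::= \alpha \mid \mathrm{int} \mid \forall[\Delta].\Gamma \mid \langle \tau_1^{\varphi_1}, \ldots, \tau_n^{\varphi_n} \rangle \mid \exists\alpha.\tau$
initialization flags    $\varphi ::= 0 \mid 1$
label assignments       $\Psi ::= \{\ell_1 : \tau_1, \ldots, \ell_n : \tau_n\}$
type assignments        $\Delta ::= \cdot \mid \alpha, \Delta$
register assignments    $\Gamma ::= \{r_1 : \tau_1, \ldots, r_n : \tau_n\}$
registers               $r ::= r_1 \mid \cdots \mid r_k$
word values             $w ::= \ell \mid i \mid {?\tau} \mid w[\tau] \mid \mathrm{pack}\ [\tau, w]\ \mathrm{as}\ \tau'$
small values            $v ::= r \mid w \mid v[\tau] \mid \mathrm{pack}\ [\tau, v]\ \mathrm{as}\ \tau'$
heap values             $h ::= \langle w_1, \ldots, w_n \rangle \mid \mathrm{code}[\Delta]\Gamma.I$
heaps                   $H ::= \{\ell_1 \mapsto h_1, \ldots, \ell_n \mapsto h_n\}$
register files          $R ::= \{r_1 \mapsto w_1, \ldots, r_n \mapsto w_n\}$
instructions            $\iota ::= aop\ r_d, r_s, v \mid bop\ r, v \mid \mathrm{ld}\ r_d, r_s(i) \mid \mathrm{malloc}\ r_d[\vec{\tau}] \mid \mathrm{mov}\ r_d, v \mid \mathrm{st}\ r_d(i), r_s \mid \mathrm{unpack}\ [\alpha, r_d], v$
arithmetic ops          $aop ::= \mathrm{add} \mid \mathrm{sub} \mid \mathrm{mul}$
branch ops              $bop ::= \mathrm{beq} \mid \mathrm{bneq} \mid \mathrm{bgt} \mid \mathrm{blt} \mid \mathrm{bgte} \mid \mathrm{blte}$
instruction sequences   $I ::= \iota; I \mid \mathrm{jmp}\ v \mid \mathrm{halt}[\tau]$
programs                $P ::= (H, R, I)$

Fig. 2. Syntax of TAL [16]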



In order to conform more directly to stack-based runtime architectures, TAL
has been extended to include stack types [14]. The syntax of this extension is
given in Figure 3.

Other extensions of the TAL approach include type support for modules and
static linking [7], eliminating array bounds checks [34], and runtime code
generation [9]. A realistic version of TAL for the x86 architecture called TALx86 has
been developed, along with a prototype compiler for a type-safe C-like language
called Popcorn [13].

Language-Based Security 295

types                   $\tau ::= \ldots \mid \mathrm{ns}$
stack types             $\sigma ::= \rho \mid \mathrm{nil} \mid \tau :: \sigma$
type assignments        $\Delta ::= \ldots \mid \rho, \Delta$
register assignments    $\Gamma ::= \{r_1 : \tau_1, \ldots, r_n : \tau_n, \mathrm{sp} : \sigma\}$
word values             $w ::= \ldots \mid w[\sigma] \mid \mathrm{ns}$
small values            $v ::= \ldots \mid v[\sigma]$
register files          $R ::= \{r_1 \mapsto w_1, \ldots, r_n \mapsto w_n, \mathrm{sp} \mapsto S\}$
instructions            $\iota ::= \ldots \mid \mathrm{salloc}\ n \mid \mathrm{sfree}\ n \mid \mathrm{sld}\ r_d, \mathrm{sp}(i) \mid \mathrm{sst}\ \mathrm{sp}(i), r_s$
stacks                  $S ::= \mathrm{nil} \mid w :: S$

Fig. 3. Extension of TAL to accommodate stacks [14]

4.4 Efficient Code Certification (ECC) 

The author’s project on efficient code certification (ECC) [11] was conceived as a 
way to improve the runtime efficiency of small, untrusted, run-once applications 
such as applets and active messages while still ensuring safe execution. “Run- 
once” means that the cost of verification cannot be amortized over the lifetime 
of the code, so certificates should be as concise and easy to verify as possible. 

ECC attempts to identify the minimum information necessary to ensure a
basic but nontrivial level of code safety, including control flow, memory, and
stack safety, and to encapsulate this information in a succinct certificate that is
easy to produce and to verify. The level of safety currently provided by the ECC 
prototype is roughly comparable to that provided by Java bytecode verification; 
but unlike bytecode, it operates at the level of native code, thus avoiding the 
runtime overhead of bytecode interpretation or just-in-time compilation. The 
prototype implementation compiles Scheme to executable x86 machine code. 

The system does not rely on general theorem provers or typing mechanisms. 
Although less flexible than PCC or TAL, certificates are compact and easy to 
produce and to verify. The certificate can be produced by the code supplier
during the code generation phase of compilation and verified by the consumer 
at load time; both operations are automatic and invisible to both parties. 

Although inspired by PCC, it would be inaccurate to call ECC certificates 
proofs, because they are not proofs in any formal system. The certificate consists
of annotations that provide information about the structure and intention of 
the code, as well as some basic typing information. This information is derived 
from the high-level program during compilation. Guided by this information, 
the verifier checks a set of simple static conditions that inductively imply the 
desired safety properties. 

Drawbacks to ECC include platform-dependence and fragility with respect to 
fancy compiler optimizations. Simple local optimizations such as tail recursion
elimination can be handled. Preliminary experiments indicate that the sizes of
ECC certificates range from 6% to 25% of the size of the object code. This seems 
to indicate a substantial improvement over PCC, although a fair comparison 
would require a more careful analysis to take all variables into account. 

The main reason for the savings in certificate size in ECC over PCC or TAL 
is that ECC makes heavy use of compiler conventions. This is both an advantage 
and a disadvantage. The advantage is that it allows information that must be 
included explicitly in a PCC or TAL certificate to be omitted from an ECC 
certificate. The disadvantage is that it makes the verifier heavily dependent on 
the compiler implementation. 

For example, suppose subroutines always return their result in register r. 
The certificate does not need to say this, but only indicate where the subroutine 
linkages are. The verifier, knowing about this convention and knowing from the 
annotation that a certain piece of code is an instance of a standard subroutine 
linkage, has only to check that the correct subroutine linkage code is there. It may 
then proceed under the assumption that the result is in register r. Subroutine 
linkage code generated by the compiler will be fairly uniform — a simple function 
of the number and type of arguments, modulo work register names — so the check 
can be done by table lookup with unification on register names. All the certificate 
has to do is indicate the intention of the code. 
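
The table lookup with unification on register names described above can be illustrated with a small sketch. Everything in it is hypothetical: the instruction tuples, the one-argument template `CALL_TEMPLATE_1ARG`, the `?`-prefixed placeholder convention, and the function `match_linkage` are assumptions made for illustration and are not taken from the ECC prototype.

```python
# Hypothetical linkage template: names starting with "?" are placeholders
# for work registers or call targets; everything else must match literally.
CALL_TEMPLATE_1ARG = [
    ("mov", "?w", "arg0"),
    ("push", "?w"),
    ("call", "?target"),
]


def match_linkage(template, code):
    """Unify a linkage template with a code fragment.

    Returns the placeholder binding on success and None on failure.
    """
    if len(template) != len(code):
        return None
    binding = {}
    for t_ins, c_ins in zip(template, code):
        # Opcode and operand count must agree exactly.
        if len(t_ins) != len(c_ins) or t_ins[0] != c_ins[0]:
            return None
        for t_arg, c_arg in zip(t_ins[1:], c_ins[1:]):
            if t_arg.startswith("?"):
                # A placeholder must be bound to the same name everywhere.
                if binding.setdefault(t_arg, c_arg) != c_arg:
                    return None
            elif t_arg != c_arg:
                return None
    return binding


good = [("mov", "ecx", "arg0"), ("push", "ecx"), ("call", "fact")]
bad = [("mov", "ecx", "arg0"), ("push", "edx"), ("call", "fact")]
print(match_linkage(CALL_TEMPLATE_1ARG, good))
print(match_linkage(CALL_TEMPLATE_1ARG, bad))
```

In this sketch the second fragment is rejected because the work register pushed differs from the one loaded, which is the kind of inconsistency a consistent register binding is meant to catch.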

The verification process in ECC is very efficient. It is linear time except for 
a sorting step to sort jump destinations, but since almost all jumps are forward 
and local, a simple insertion sort suffices. 
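
To see why insertion sort is adequate here, consider the following minimal sketch. The function name `sort_jump_targets` and the sample addresses are assumptions made for illustration; the point is only that the number of element shifts stays small when the input is almost sorted, as it is when nearly all jumps are forward and local.

```python
def sort_jump_targets(targets):
    """Insertion sort returning the sorted list and the number of shifts.

    On nearly sorted input the shift count is small, so the running time
    is close to linear in the number of jump destinations.
    """
    out = list(targets)
    shifts = 0
    for i in range(1, len(out)):
        key = out[i]
        j = i - 1
        # Move larger elements one slot to the right.
        while j >= 0 and out[j] > key:
            out[j + 1] = out[j]
            shifts += 1
            j -= 1
        out[j + 1] = key
    return out, shifts


# Hypothetical jump destinations in order of appearance in the code.
destinations = [16, 36, 24, 64, 56, 92]
print(sort_jump_targets(destinations))
```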

4.5 Information Flow 

Language-based methods can be used to control information flow among mutu- 
ally distrustful agents [17-19]. This is similar to other forms of safety described
in the previous section, except that the security policies are based on a model
of information flow. The policy is specified by the user by means of annotations
in the high-level language that limit how information can flow in a program and
between programs. Annotated programs are then checked at compile time to
ensure that they conform to the flow rules.

Currently, a prototype implementation called JFlow has been created that 
augments Java with information flow primitives [17]. The JFlow compiler is im- 
plemented as a source-to-source translator that checks information flow safety 
using a type-checking mechanism, then discards the annotations, emitting ordi- 
nary Java. If the control information were passed down to the object code, then 
downloaded object code could be verified in a manner similar to TAL or ECC 
before running to ensure that it does not leak information.

Abstract. This paper is an attempt to apply domain-theoretic ideas to
a new area, viz. knowledge representation. We present an algebraic model 
of a belief system. The model consists of an information domain of special 
kind (belief algebra) and a binary relation on it (entailment). It is shown 
by examples that several natural belief algebras are, essentially, algebras 
of flat records. With an eye on this, we characterise those domains and 
belief algebras that are isomorphic to domains or algebras of records. 
For illustration, we suggest a system of axioms for revision in such a 
model and describe an explicit construction of what could be called a 
maxichoice revision.



1 Introduction 

In general, a belief may be considered as a more or less complete description of 
some unknown possible world, where the term ‘description’ can be given a lot 
of various meanings. In non-numerical formalisms, knowledge most frequently is 
represented by logical formulas or sets of formulas. Our starting point here is the 
following thesis: the set of beliefs of an agent is a kind of information domain. 

Theory of domains is recognized to be a basis of the denotational semantics 
of programming languages. Recently, domains have been used as data models 
for complex database objects, which allows many of the basic results
of relational databases to be generalised [17,4,13]. We carry on the trend and propose below a
domain-based formalism as a conceptual apparatus for representing beliefs and
belief change operations. Namely, we present an algebraic model of a belief system
of an imagined agent, and cite, without proof, a number of results describing
its structure; some of them may be of independent interest in domain theory. We
also briefly illustrate the use of the model: in the last section a revision-like belief
operation is constructed. Regretfully, for lack of space an in-depth comparison
of our approach with traditional ones is left to another paper.

Today, the theory and practice of knowledge bases is mainly a part of applied
logic. A motivation for our approach is to find out whether, and in which way,
the existing algebraic data models in databases can be adopted for knowledge
representation and modelling knowledge change operations. Another field where
such algebraic models could be fruitful is reasoning about state changes in
various systems of dynamic database logic [20, 7, 19]. In the extensive paper [19], both
declarative and operational semantics for updates are presented; denotational
semantics is still to be developed.

The idea of solving problems in logic by translating them into an algebraic 
language and then using the powerful methods of universal algebra for solving 
them is the essence of algebraic logic. Methodology of the so-called general al- 
gebraic logic (see, e.g., [16]) has influenced the present work. In particular, the 
proposed model of a belief system, considered together with what we call below 
a possible world semantics for it, resembles some basic constructs in that field.
Our aim is to initiate similar research in the mentioned related areas. 



2 Domains: Some Preliminaries 

We review in this section a few special kinds of domains and constructions in
them. 

In [17, Def. 3.2], A. Ohori calls a (description) domain any poset which (1) 
has the bottom, (2) has the pairwise bounded join (p.b.j.) property: every pair 
of elements bounded from above has the least upper bound, and (3) is effective 
in a natural sense. He also argues there why such domains are more appropriate
for modeling database objects than the so-called Scott domains which are in
use in programming theory. Such a weakened notion of a domain is also useful
in our discussion as the starting point. Moreover, as we are not going to deal
with effectiveness matters here, we shall drop (3) and call a domain any poset
satisfying (1) and (2).
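
For a finite poset, conditions (1) and (2) can be checked directly. The following is a minimal sketch under stated assumptions: the poset is given as a list of elements plus an order predicate `leq`, and the names `is_domain`, `flat_leq`, and `bad_leq` as well as the two sample posets are hypothetical illustrations.

```python
from itertools import combinations


def is_domain(elements, leq):
    """Check conditions (1) and (2) for a finite poset.

    (1) there is a bottom element;
    (2) every pair bounded from above has a least upper bound.
    """
    elements = list(elements)
    if not any(all(leq(b, x) for x in elements) for b in elements):
        return False
    for a, b in combinations(elements, 2):
        uppers = [u for u in elements if leq(a, u) and leq(b, u)]
        # A bounded pair needs one upper bound below all the others.
        if uppers and not any(all(leq(u, v) for v in uppers) for u in uppers):
            return False
    return True


# Flat poset: bot below two incomparable elements (a domain).
def flat_leq(x, y):
    return x == y or x == "bot"


# a and b have two incomparable upper bounds c and d (not a domain).
def bad_leq(x, y):
    return x == y or x == "bot" or (x in ("a", "b") and y in ("c", "d"))


print(is_domain(["bot", "a", "b"], flat_leq))
print(is_domain(["bot", "a", "b", "c", "d"], bad_leq))
```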

As usual, $\bot$ stands for the bottom element of a domain. We call elements $a$
and $b$ of a domain compatible, and write $a \between b$, if the join $a \vee b$ exists. A subset
$A$ of a domain is said to be compatible if any two elements of it are. A nonempty
subset $J$ of a domain $D$ is an ideal if it is downward closed and if $a \vee b \in J$
whenever $a, b \in J$ and $a \between b$. For instance, the subset $\downarrow a := \{x : x \le a\}$ is an
ideal, called the principal ideal generated by $a$. The ideal $J$ is said to be strong
[4, p. 33] if it is closed under all existing joins.

We shall need some more structure on a domain. A domain is said to be
multiplicative if every two elements $a$ and $b$ of it have the meet $a \wedge b$. In this
case, every principal ideal of the domain is a lattice; the converse, however, need
not hold true. A multiplicative domain $D$ is distributive if every sublattice $\downarrow a$
is distributive. We call a domain semi-boolean if every principal ideal of it is a
boolean lattice. Subtraction can naturally be defined on a semi-boolean domain
by $a - b := a \wedge (a \wedge b)'_a$, where $(x)'_a$ stands for the complement of $x$ in $\downarrow a$.

A domain D is complete if it is a CPO, i.e., if every chain in it has the
least upper bound. In this case it is also bounded complete: every subset of D
bounded from above has the join and, hence, every non-empty subset has the
meet; in particular, a bounded complete domain is multiplicative. The domains
considered in [4, 13] are complete and distributive.

Example 1. A tolerance space¹ is a set $\mathcal{X}$ considered together with a reflexive
and symmetric relation $\between$ on it. Given such a space, and arbitrary subsets $X$ and
$Y$ of it, write $x \between Y$ for $(\forall y \in Y)\, x \between y$, and $X \between Y$ for $(\forall x \in X)\, x \between Y$.

Now let $\mathcal{D} := \{X \subseteq \mathcal{X} : X \between X\}$. Then, for $X, Y \in \mathcal{D}$, the union $X \cup Y$
belongs to $\mathcal{D}$ iff $X \between Y$; moreover, $\mathcal{D}$ is downward closed. Any subset $D$ of $\mathcal{D}$
containing the empty set and closed under existing unions is a domain. We shall
call domains of this kind tolerance domains. If the domain $D$ is also closed under
intersections, then it is distributive, and if $D$ is closed under set subtraction, then
it is semi-boolean. For instance, the subset $\mathcal{D}_{\mathrm{fin}}$ consisting of the finite members
of $\mathcal{D}$ is a semi-boolean domain. The full domain $\mathcal{D}$ itself is complete. □
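
The construction of Example 1 can be tried out on a small finite tolerance space. The sketch below is only an illustration: the space {1, 2, 3, 4}, the tolerance `near` (two numbers are tolerant if they differ by at most 1), and the helper names `coherent` and `full_domain` are assumptions, not part of the paper.

```python
from itertools import combinations


def coherent(subset, tol):
    """A subset is coherent if any two of its members are tolerant."""
    return all(tol(x, y) for x in subset for y in subset)


def full_domain(space, tol):
    """All coherent subsets of a finite tolerance space."""
    space = list(space)
    return {
        frozenset(c)
        for r in range(len(space) + 1)
        for c in combinations(space, r)
        if coherent(c, tol)
    }


def near(x, y):
    # Reflexive and symmetric, but not transitive.
    return abs(x - y) <= 1


D = full_domain(range(1, 5), near)
print(sorted(sorted(s) for s in D))
```

In this space the coherent sets are the empty set, the four singletons, and the three pairs of neighbours; the set {1, 3} is excluded because 1 and 3 are not tolerant.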

Theorem 1. Every distributive domain is isomorphic to a tolerance domain. 

This is a natural generalisation of the Birkhoff-Stone representation theorem
for distributive lattices ([12, Theorem II.1.19]; see also [9, Theorem 10.3]).
Any domain of flat records is, in fact, a tolerance domain. Indeed, suppose
that $(V_x : x \in L)$ is a family of non-empty sets. A record is defined to be a
partial function $f$ on $L$ such that $f(x) \in V_x$ for every $x \in \mathrm{dom} f$. Now put
$\mathcal{X} := \{(x, v) : x \in L \text{ and } v \in V_x\}$ and let $(x, u) \between (y, v)$ mean ‘either $x \ne y$ or
$u = v$ (or both)’. Then the set $\mathcal{R}$ of all records coincides with the domain $\mathcal{D}$ (as
it was defined above). A record domain is any subdomain of $\mathcal{R}$.

Given elements $a$ and $b$ of a domain $D$, we say that $a$ is overridden by $b$ (in
symbols, $a \sqsubseteq b$) if $x \le a$ and $x \between b$ imply $x \le b$. In a record domain, $f \sqsubseteq g$
iff $\mathrm{dom} f \subseteq \mathrm{dom}\, g$. Generally, the relation $\sqsubseteq$ fails to be transitive: this property
is characteristic of record domains. For example, a tolerance domain with $\sqsubseteq$
transitive is easily shown to be isomorphic to a record domain.
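
The characterisation of the overridden-by relation in a record domain can be checked by brute force on a small example. This is a sketch under assumptions: records are modelled as frozensets of (attribute, value) pairs over two hypothetical attributes `x` and `y` with values 0 and 1, and all function names are illustrative.

```python
from itertools import product


def all_records(keys, values):
    """All partial functions keys -> values, as frozensets of pairs."""
    records = []
    for choice in product([None, *values], repeat=len(keys)):
        records.append(
            frozenset((k, v) for k, v in zip(keys, choice) if v is not None)
        )
    return records


def dom(f):
    return {k for k, _ in f}


def compatible(f, g):
    """Two records are compatible if their union is still a function."""
    union = f | g
    return len(dom(union)) == len(union)


def overridden_by(a, b, domain):
    """a is overridden by b: every part of a compatible with b lies in b."""
    return all(x <= b for x in domain if x <= a and compatible(x, b))


RECORDS = all_records(["x", "y"], [0, 1])
agrees = all(
    overridden_by(f, g, RECORDS) == (dom(f) <= dom(g))
    for f in RECORDS
    for g in RECORDS
)
print(len(RECORDS), agrees)
```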

3 Belief Systems: an Intuitive Background 

Let us discuss informally the system of beliefs of an imagined agent A. A few 
illustrations to this discussion will be presented at the end of the section. 

Consider the set I of “all possible” pieces of information accessible to A. We
assume that the agent regards some pieces as parts of, or included in, others,
i.e. that there is a certain order relation $\le$ on I. We also assume that the agent
admits as (actual or potential) beliefs only those pieces that he considers as
coherent. In particular, our agent might not be ready to fit together any two
beliefs. Presumably, A considers two beliefs as compatible in this sense if there
is a belief that includes both of them. Finally, if A is not a nominalist, he may
find it convenient to admit the “empty” belief containing no information.

Let B stand for the partially ordered set of all beliefs of A. The above con- 
siderations suggest that it should be a domain. Certainly, B would have a more 
specific structure; we shall return to this point later. 

¹ We owe to an anonymous referee the note that this notion is similar to that of
Girard's coherence space (which arose in the semantics of linear logic), as well as the
recent reference [2] on this subject.

Some beliefs of A may still be inconsistent in some “objective” sense, but
we also suppose that the agent is rational enough to accept all the consistent 
pieces of information as coherent. A consistent piece could be thought as one 
that obeys certain integrity constraints. Note that, from this point of view, a 
constraint itself need not appear explicitly as a piece of information. 

Furthermore, a piece of information not only has parts, but usually also
entails “new” pieces of information as well. Then, for example, a belief would
qualify as consistent if any two beliefs it entails are compatible, and as
paraconsistent if it does not entail all beliefs. Naturally, the entailment relation should be a
preorder on the set of beliefs; moreover, a belief should entail its parts.

Genuine belief change is intimately connected with the notion of consistency.
However, the agent’s views on the structure of his beliefs, as well as some ele- 
mentary operations of recording and manipulating (pieces of) information should 
be independent of the latter. We shall mention three structural operations on 
beliefs. 

One of these is join, as we have already made the assumption that B should 
be a domain. We also assume that the agent is able to pick out the common part 
of two pieces of information, i.e. that the domain B is multiplicative. So, we are 
faced with two operations, $\vee$ and $\wedge$, on B.

The third operation $\triangleleft$, which we call overriding,² is a rather special kind
of a belief change operation often called revision [1,10]. Suppose that $a$ is a
certain “actual” belief, or a “cognitive state”, of A and that one more coherent
piece of information, $b$, is in question. The belief $b$ being, generally, incompatible
with $a$, the agent must delete something from $a$ and join $b$ to the remainder if
he wants both to accept $b$ and to keep the resulting information coherent. The
“forgotten” part of $a$ is to be minimal in the sense that all that is compatible
with $b$ is retained. On the other hand, nothing not contained in $a$ or $b$ should emerge
in the result.

The resulting structure, the set of beliefs together with the three structural
belief operations, is the intuitive prototype of the belief algebras considered below.
Of course, $\vee$ is merely a restriction of $\triangleleft$; however, for technical reasons it is
useful to treat them as separate operations. Now, the belief system of the agent
A consists of a belief domain and a fixed entailment on the latter. We do not
include essential (i.e. non-structural) belief change operations in the model of a
belief system: the agent may prefer different operations on different occasions, or
due to different suppositions about the outer world, etc. (But we guess that, as
far as cognitive states of A are not required to be closed under entailment, the
operation $\triangleleft$ is the most liberal revision: the inclusion $a \triangleleft^* b \le a \triangleleft b$ should hold
for any reasonable revision-like operation $\triangleleft^*$.)

Example 2. A good illustration to the considerations that led us to the above
idea of a belief system is provided by any tolerance space, the tolerance relation
itself being interpreted as a kind of compatibility. The role of I is played, in the
notation of Example 1, by the powerset of $\mathcal{X}$, and we can take the full domain
$\mathcal{D}$ for B. (A variant: I may be the set of finite subsets of $\mathcal{X}$.) The first two belief
operations are union and intersection, respectively, while overriding is defined in
a natural way as follows: $X \triangleleft Y := \{x \in X \cup Y : x \between Y\} = \{x \in X : x \between Y\} \cup Y$.
Clearly, not every tolerance domain is closed under the new operation $\triangleleft$.

² We borrow the term from [6]. As a matter of fact, the operation from [6] is a
restriction of ours to “atomic” pieces of information. The so-called passive updating
[19] is another relative of overriding.

In particular, in a record domain $\mathcal{R}$, $f \triangleleft g$ is the record $h$ such that $\mathrm{dom}\, h =
\mathrm{dom} f \cup \mathrm{dom}\, g$ and $h(x) = {}$ if $x \in \mathrm{dom}\, g$ then $g(x)$ else $f(x)$.
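
The two descriptions of overriding in Example 2 can be compared on a concrete record. In the sketch below, records are modelled as Python dictionaries, the tolerance on (attribute, value) pairs is the one from Example 1, and the names `pair_tol`, `override_sets`, `override_records` and the sample data are assumptions chosen for illustration.

```python
def pair_tol(p, q):
    """Tolerance on (attribute, value) pairs: different attribute or same value."""
    return p[0] != q[0] or p[1] == q[1]


def override_sets(X, Y, tol):
    """X overridden by Y in a tolerance space: keep what tolerates all of Y, add Y."""
    return frozenset(x for x in X if all(tol(x, y) for y in Y)) | frozenset(Y)


def override_records(f, g):
    """f overridden by g on records: g wins wherever it is defined."""
    h = dict(f)
    h.update(g)
    return h


f = {"name": "Ann", "city": "Riga"}
g = {"city": "Oslo", "age": 30}

h = override_records(f, g)
same = override_sets(f.items(), g.items(), pair_tol) == frozenset(h.items())
print(h, same)
```

On records, overriding is thus the familiar dictionary update: the attribute `city` is taken from `g`, while `name` survives from `f`.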

Furthermore, let Cn be a closure operator on B, i.e., an isotonic idempotent
operation that takes subsets of $\mathcal{X}$ from B into subsets in B, and satisfies also
the condition $A \subseteq \mathrm{Cn}(A)$. In the spirit of algebraic logic, let us think of Cn as of
a consequence operation in $\mathcal{X}$: $x$ is said to be a consequence of $A$ (in symbols,
$A \vdash x$) iff $x \in \mathrm{Cn}(A)$. Now, we consider that $A$ entails $B$ iff $B \subseteq \mathrm{Cn}(A)$. In
particular, if $A \vdash x$ always implies that $x \between A$, then the triple $(\mathcal{X}, \mathcal{D}_{\mathrm{fin}}, \vdash)$ is
an example of D. Scott’s information system (see, e.g., item 3.32 in [9] for the
definition). □

The standard logical viewpoint on belief systems fits in this scheme if we
consider sets of formulas rather than individual formulas as beliefs.

Example 3. Let Frm be the set of formulas of some propositional language.
Take sets of formulas for the pieces of information. The agent may want to avoid
such “visually inconsistent” pieces of information as $\{\varphi, \neg\varphi\}$, and reckon a set of
formulas coherent if it contains no formula together with its negation. This way,
one comes to the full tolerance domain based on the space $(\mathrm{Frm}, \between)$, where $\varphi \between \psi$
iff neither of the formulas $\varphi$ and $\psi$ is the negation of the other. Furthermore, let $F$
entail $G$ iff $F$ logically implies every formula from $G$. Note that $F$ is consistent
iff the set of formulas $F$ entails is indeed coherent.

The so-called AGM postulates for rational belief revision [1,11] deal with
closed sets of formulas as cognitive states, and involve the single inconsistent
closed set, i.e. the whole set of formulas. In our setting, this set, being incoherent,
is ruled out. □

This view of propositional beliefs can be further developed in various ways.
For instance, if the agent does not distinguish between such formulas as $\varphi \wedge \psi$
and $\psi \wedge \varphi$, or $\varphi \vee \varphi$ and $\varphi$, then the set (in fact, the algebra) of formulas, Frm,
must be factorised modulo an appropriate congruence relation.

A somewhat different view of propositional beliefs is both possible and useful.
Namely, we can present every propositional belief as a flat record.

Example 4. Let B consist of all partial functions (a variant: of the functions
having a finite domain) from Frm to the set $2 := \{0, 1\}$. Each function being
considered as a set of ordered pairs, B is ordered by set inclusion. The intended
interpretation of a function $f \in B$ is as follows: the agent sets $f(\varphi) = 1$ if he
believes that the formula $\varphi$ is true, and $f(\varphi) = 0$ if he believes that $\varphi$ is false;
he leaves $f(\varphi)$ undefined if he has no definite opinion about the truth of $\varphi$.

Join, meet and overriding are now understood as in Example 2. As to
entailment, let $f$ entail $g$ if the set of formulas $\{\varphi : f(\varphi) = 1\} \cup \{\neg\varphi : f(\varphi) = 0\}$
logically implies every formula $\psi$ such that $g(\psi) = 1$ as well as every formula
$\neg\psi$ such that $g(\psi) = 0$. □

304 Jānis Cīrulis

Beliefs based on any many-valued logic can naturally be presented along
these lines. The so-called truth value semantics of first-order predicate logic can
also be described in the same fashion. This explains our interest in record
domains.



4 Formalisation of Intuition: Belief Algebras 

We now reformulate the intuitive views on the structure of the set of beliefs of 
the agent as axioms of belief algebras. 

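Definition 1. Suppose that $B$ is a multiplicative domain. An overriding
operation on $B$ is any binary operation $\triangleleft$ that satisfies the following axioms:

$\triangleleft$1: $b \le a \triangleleft b$,

$\triangleleft$2: $a \triangleleft b \le (a \wedge (a \triangleleft b)) \vee b$,

$\triangleleft$3: if $c \le a$ and $c \between b$, then $c \le a \triangleleft b$.

If $\triangleleft$ is such an operation, the partial algebra $(B, \vee, \wedge, \triangleleft, \bot)$ is said to be a belief
algebra. □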

Every tolerance domain equipped with the overriding operation as in Example 2
is a belief algebra. We do not know any characterisation of the class of those belief
algebras isomorphic to algebras of this kind. However, the following proposition
shows that the structure of a belief algebra is similar to that of tolerance domains.
In particular, the operation $\triangleleft$ is determined by the axioms $\triangleleft$1-$\triangleleft$3 uniquely.

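Proposition 1. Suppose $B$ is a multiplicative domain. The operation $\triangleleft$ on $B$
is overriding iff, for all $a, b \in B$, $a \triangleleft b = \max\{x : b \le x \le (a \wedge x) \vee b\}$. If this is the
case, then

(a) $a \le b$ iff $b \triangleleft a = b$, and $a \sqsubseteq b$ iff $a \triangleleft b = b$,

(b) $a \between b$ iff $a \triangleleft b = b \triangleleft a$, and $a \triangleleft b = a \vee b$ in that event.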
(h) a b iff a < b — b < a, and a<b = aV b in that event. 

We further show in what sense overriding obeys the principle of minimal
change [11]. The following definition is an adaptation of the standard definition
of the so-called lattice betweenness [3, §9].

Definition 2. The l-betweenness on a multiplicative domain $D$ is the ternary
relation $\beta$ defined by $\beta(a, b, c) := a \wedge c \le b \le (a \wedge b) \vee (b \wedge c)$. □

The notation $\beta(a, b, c)$ is usually read as ‘$b$ lies between $a$ and $c$’. The following
well-known fact justifies another reading ‘$b$ is closer to $a$ than $c$ is’.

Proposition 2. For every $a \in D$, the binary relation $\propto_a$ defined by $b \propto_a c :=
\beta(a, b, c)$ is a partial ordering on $D$ with $a$ the bottom element. In particular, $\propto_\bot$
coincides with $\le$.

Example 5. In the situation of Example 4, an interpretation is just a belief $f$
with $\mathrm{dom} f = P$, where $P$ is the set of all atomic formulas. Now, $g \propto_f h$ for such
beliefs $f, g, h$ iff $\mathrm{Diff}(f, g) \subseteq \mathrm{Diff}(f, h)$, where $\mathrm{Diff}(a, b) := \{p \in P : a(p) \ne
b(p)\}$. Orderings of this kind have been used to describe concrete revision/update
operations (see Sect. 4 of [14] for an overview). □
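
The equivalence stated in Example 5 can be confirmed by enumeration for a small set of atoms. The sketch below assumes three hypothetical atoms `p`, `q`, `r`, represents interpretations as frozensets of (atom, truth value) pairs, and uses intersection and union for the meets and joins occurring in Definition 2; the function names are illustrative.

```python
from itertools import product

ATOMS = ("p", "q", "r")


def interpretations(atoms):
    """All total truth assignments, as frozensets of (atom, value) pairs."""
    return [frozenset(zip(atoms, bits)) for bits in product((0, 1), repeat=len(atoms))]


def between(a, b, c):
    """l-betweenness: a meet c <= b <= (a meet b) join (b meet c)."""
    return (a & c) <= b <= (a & b) | (b & c)


def diff(a, b):
    """Atoms on which two interpretations disagree."""
    da, db = dict(a), dict(b)
    return {p for p in da if da[p] != db[p]}


WORLDS = interpretations(ATOMS)
agrees = all(
    between(f, g, h) == (diff(f, g) <= diff(f, h))
    for f, g, h in product(WORLDS, repeat=3)
)
print(len(WORLDS), agrees)
```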

Given a subset $A$ of a multiplicative domain, we denote by $\min_\propto(A, a)$ the
least element of $A$ w.r.t. $\propto_a$ (when it exists). The following theorem says that
$a \triangleleft b$ is the closest element to $a$ among those including $b$. Its proof reduces to
direct calculations.

Theorem 2. Suppose that $B$ is a multiplicative domain and that $\triangleleft$ is a binary
operation on $B$. Then $\triangleleft$ is overriding iff, for all $a$ and $b$ from $B$, $a \triangleleft b =
\min_\propto(\{x : b \le x\}, a)$.
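
Theorem 2 can likewise be illustrated on a small record domain. This sketch assumes the same hypothetical two-attribute records as a frozenset-of-pairs model; `closest_above` searches for the least element of $\{x : b \le x\}$ with respect to the ordering ‘closer to $a$’ and is compared with record overriding.

```python
from itertools import product


def all_records(keys, values):
    """All partial functions keys -> values, as frozensets of pairs."""
    return [
        frozenset((k, v) for k, v in zip(keys, choice) if v is not None)
        for choice in product([None, *values], repeat=len(keys))
    ]


def override(f, g):
    """Record overriding: g wins wherever it is defined."""
    dom_g = {k for k, _ in g}
    return frozenset(p for p in f if p[0] not in dom_g) | g


def between(a, b, c):
    """l-betweenness: a meet c <= b <= (a meet b) join (b meet c)."""
    return (a & c) <= b <= (a & b) | (b & c)


def closest_above(a, b, domain):
    """Least element of {x : b <= x} w.r.t. 'closer to a', or None if absent."""
    above = [x for x in domain if b <= x]
    least = [m for m in above if all(between(a, m, x) for x in above)]
    return least[0] if len(least) == 1 else None


RECORDS = all_records(["x", "y"], [0, 1])
print(all(closest_above(a, b, RECORDS) == override(a, b)
          for a in RECORDS for b in RECORDS))
```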

We end this section with a representation theorem for belief algebras of a
special kind. A record algebra is a multiplicative record domain closed under the
overriding of Example 2.

Theorem 3. A semi-boolean belief algebra B is isomorphic to a record algebra
iff the overriding operation of B is associative.

Surprisingly, this natural-looking theorem requires involved algebraic
techniques for proving its non-trivial part. We only outline the proof, and note that
[5] is the standard reference book on the general theory of partial algebras.

First of all, an algebra turns out to be isomorphic to a record algebra iff it is a
subdirect product of flat belief algebras (a belief algebra is said to be flat if two
distinct elements of it are comparable only under the condition that one of them
is $\bot$). Furthermore, the class $\mathcal{B}$ of all belief algebras with associative overriding
can be characterised by equations and is, in fact, an E-variety. Then Theorem
20.3.8 of [5] applies: it follows that $\mathcal{B}$ is the class of all subdirect products of
its subdirectly irreducible members. Finally, an algebra from $\mathcal{B}$ is subdirectly
irreducible iff it is a flat belief algebra. Hence, $\mathcal{B}$ is just the class of isomorphic
copies of record algebras.

Remark 1. Being multiplicative is, really, not a necessary property of a domain
in order that an overriding-like operation can live on it. Indeed, in a belief algebra
any subset $\{x : x \le a \text{ and } x \between b\}$ has the greatest element, which we denote by
$a \restriction b$, and always $a \restriction b = (a \triangleleft b) \wedge a$ and $a \triangleleft b = (a \restriction b) \vee b$. Now, if $a \restriction b$ exists in some
domain $D$ for all $a$ and $b$, the latter identity serves as the definition of $\triangleleft$. □

5 Logic on Domains 

In this section, we algebraise the notion of entailment and some related ones.

Definition 3. By an entailment on a domain $B$ we mean any transitive relation
$\prec$ on $B$ such that every subset $B_a := \{x : a \prec x\}$ is a strong ideal of $B$ containing
$a$. We read $a \prec b$ as ‘$a$ entails $b$’. The notation $a \approx b$ means that $a \prec b$ and $b \prec a$.
A belief system is a pair $(B, \prec)$, where $B$ is a belief algebra and $\prec$ is an entailment
on $B$. □

Examples 2-4 yield some concrete belief systems and, hence, provide
justification for the concept. However, the above concept of entailment is more
liberal than that in classical logic, because it allows for modelling some aspects
of paraconsistency.

Assume that $(B, \prec)$ is a belief system. A subset $J$ of $B$ is said to be a belief
set if $J$ is a strong ideal of $B$ closed under entailment. Where $K$ is a subset
of $B$, we denote by $B_K$ the intersection of all those belief sets including $K$. In
particular, if $b \between c$, then $B_{b,c} = B_{b \vee c}$. The element $c$ of $B$ is said to consist with
$b$ if



$(\forall x \in B_{b,c})\ \bigl((\exists y \in B_{b,c})\; x \not\between y \;\Rightarrow\; x \in B_b\bigr)$.

Intuitively, $c$ consists with $b$ if adding $c$ to $b$ does not yield “new contradictions”.
We denote by $C(b)$ the set of elements consisting with $b$. (Note that such an
element need not be compatible with $b$.) It turns out that $C(a) = C(b)$ whenever
$a \approx b$.

The above definition of entailment is formulated merely in terms of the under- 
lying belief domain and can be treated, for this reason, as inner, or syntactical. 
We now adopt a tool widely used in logic to generate outer entailment relations 
semantically. The subsequent definitions are illustrative and will not be used in 
the next section. 

Definition 4. A possible world space for a belief domain $B$ is a pair $(W, \Vdash)$,
where $W$ is a nonempty set of possible worlds and $\Vdash$ is a relation in $W \times B$ such
that every subset $B^w := \{a : w \Vdash a\}$ is a maximal strong ideal of $B$. The possible
world $w$ is said to be normal if $B^w$ is compatible. □

For instance, if $M$ is a set of maximal elements of $B$, then $(M, \ge)$ is such a
space. In the case of propositional beliefs, the set of all interpretations together
with the ordinary relation of being a model provides an example with all possible
worlds normal.

Proposition 3. Given a possible world space $(W, \Vdash)$ for $B$, let $\mathrm{Mod}(a) :=
\{w : w \Vdash a\}$ and $a \models b := \mathrm{Mod}(a) \subseteq \mathrm{Mod}(b)$. Then $(B, \models)$ is a belief system.

The following definition also has its roots in logic.

Definition 5. A possible world semantics for a belief system $(B, \prec)$ is any
possible world space for $B$ such that, in the notation of Proposition 3, $a \prec b$ implies
$a \models b$. If the converse also holds, the semantics is said to be adequate. □

6 Belief Revision in a Belief System

The internal structure of a belief system makes it possible to define various 
belief change operations purely in terms of ingredients of the structure; taking 
into consideration a suitable semantics increases these possibilities, just as for 
propositional beliefs: cf. [14, 15]. As an illustration, we shall briefly discuss some 
concrete syntactic constructions which demonstrate that, and how, familiar ideas
concerning belief revision can be realised algebraically. For the rest, we assume
that a complete belief system (B, ≺) is fixed.

Definition 6. Revision on B is a binary operation <* on B subject to the
axioms

<*1: b < a <* b,

<*2: a <* b < a < b,

<*3: if a ∈ C(b), then a < b < a <* b,

<*4: a <* b ∈ C(b),

<*5: if b ≈ c, then a <* b ≈ a <* c. □

These axioms are more or less direct analogues of the so-called AGM postulates
(K2)-(K6) for revision, respectively (see [1, 11]; the postulate (K1) reduces
to the condition that the operation <* is total). Indeed, both (K2) and (<*1)
guarantee that the new information by which we revise the belief state is accepted
in the result. Further, (<*2) says that revision is stronger than pure overriding.
If a is compatible with b, the right hand side of (<*2) reduces to a ∨ b, and this
is essentially the case covered by (K3). The axiom (<*3) is a converse and shows
that if the present belief state is consistent with the new information, then
revision reduces to expansion; (K4) usually is interpreted the same way. The
meaning of (K5) is that the resulting belief set is consistent unless the new
information is logically impossible; (<*4) says that the new belief set does not
yield more contradictions than the new piece of information itself. The last axiom
means that <* is extensional in its second argument, but we feel that it cannot be
the case, in general, with the first argument in the presence of (<*2). Likewise,
(K6) also is a kind of extensionality axiom.

We now turn to explicit modelling of certain revision operations.

For every a, b ∈ B, we denote by R(a, b) the set {x ∈ C(b) : b < x < a < b}.
By (<*1)-(<*3), the revision of a by b always belongs to R(a, b). If R(a, b) is
finite, or if we assume the Zorn Lemma, then every element of R(a, b) is included
in a maximal one. Considering the meet of some selected maximal elements of
R(a, b), we obtain a particular revision operation. This idea is similar to the use
of selection functions in [1, 11] for representing certain contraction operations.

Given a subset A and an element a of B, let Max< A stand for the set of
maximal elements of A. It can be shown that, in fact,

R(a, b) = {x < b : x ∈ Ca(b)},   Max< R(a, b) = Max<{x < b : x ∈ Ca(b)},

where Ca(b) := (↓a) ∩ C(b). In particular, always a <* b = a' < b for an
appropriate a' ∈ Ca(b).

Theorem 4. Suppose that σ is a mapping that assigns a nonempty subset of
Max< Ca(b) to every pair a, b of elements of B. Define the operation <* by

a <* b := ⋀{x < b : x ∈ σ(a, b)} .

Then <* satisfies the axioms (<*1)-(<*5).

We call such an operation a partial meet revision. If, moreover, σ(a, b) is a
singleton, we call the operation a maxichoice revision. (These attributes are
borrowed from [1, 11].) The proof of the following natural characteristic of
maxichoice revision is straightforward.

Theorem 5. The operation <* is a maxichoice revision iff it satisfies the
condition

<*6: if c ∈ R(a, b) and c ∈ C(a <* b), then c < a <* b.

Let Min_β(A, a) stand for the set of minimal elements of A w.r.t. oCq (see
Sect. 4). Our last theorem shows that any maxichoice revision obeys a kind of
minimal change principle.

Theorem 6. For every p ∈ B,

p ∈ Max< R(a, b) if and only if p ∈ Min_β({x ∈ C(b) : b < x}, a) .

7 Conclusion 

We have demonstrated that some essential aspects of the traditional
representation of knowledge by logical means can be treated in purely algebraic
terms. General considerations supported by the view on beliefs as coherent sets of
logical formulas have led us to a general algebraic model of a belief system. It
seems to be sufficiently powerful: at least some of the ideas related to belief
change and first developed for propositional beliefs can be realized in the model.
On the other hand, the propositional beliefs are more tractable than the algebraic
ones from a computational point of view. The question as to in which fields,
and to what extent, this algebraic approach to knowledge representation may
be fruitful requires further investigation. In this respect, W. Pratt’s reflections
[18, p. 601] on interconnections between dynamic logic and dynamic algebra are
stimulating.

Abstract. Since the work of Rabin [9], it has been known that any
monadic second order property of the (labeled) binary tree with successor
functions (and not the prefix ordering) is a monadic $\Delta_3$ property.

In this paper, we show this upper bound is optimal in the sense that
there is a monadic $\Sigma_2$ formula, stating the existence of a path where
a given predicate holds infinitely often, which is not equivalent to any
monadic $\Pi_2$ formula. We even show that some monadic second order
definable properties of the binary tree are not definable by any boolean
combination of monadic $\Sigma_2$ and $\Pi_2$ formulas.

These results rely in particular on applications of Ehrenfeucht-Fraïssé
like game techniques to the case of monadic $\Sigma_2$ formulas.



1 Introduction 

In this paper we are interested in a problem of descriptive complexity, an impor- 
tant and rapidly growing research area in theoretical computer science. Descrip- 
tive complexity was proposed by Fagin in the seventies (see [3]) as an approach to 
fundamental problems of complexity theory such as whether NP equals co-NP.
While ordinary computational complexity theory is concerned with the amount 
of resources (such as time or space) necessary to solve a given problem, the 
idea of descriptive complexity is to study the expressibility of problems in some 
fixed logical formalism. For instance, in his seminal paper of 1974, Fagin shows 
that NP problems coincide (over finite structures) with the problems expressible 
in existential second order logic. Since then, there has been a large number of 
results in descriptive complexity. We note that most of these results concern fi- 
nite structures (which are those interesting for the applications in computational 
complexity theory), but studying descriptive complexity also over infinite struc- 
tures makes sense and may lead to a better comprehension of the expressiveness 
of various logical systems.

Here, we are concerned with the structure of the monadic second order logic
(MSOL) on the infinite binary tree. It is a well known result of Rabin [9] that
any monadic second order formula (MS-formula) is equivalent, on the binary
tree, to a monadic $\Sigma_2$ formula, that is a formula of the form

$\exists X_1 \ldots X_m \forall Y_1 \ldots Y_n \, \varphi$

where $\varphi$ is a first order formula. This result comes from the translation of
any MS-formula into tree-automata. It is also well-known [9] that there are
formulas which are not equivalent to any monadic $\Sigma_1$ formula (of the form
$\exists X_1 \ldots X_m \, \varphi$). Since many logics of programs [2,12] can be
translated into MS-formulas on the binary tree, this difference of expressive power
between monadic $\Sigma_1$ and $\Sigma_2$ allows a classification of these logics.

However, this upper bound obtained by Rabin works in the presence of a
particular relation symbol, in the vocabulary of the logic considered, whose
interpretation is the prefix ordering over nodes of the binary tree. If instead of
this ordering we consider the two successor functions over nodes of the binary
tree, Rabin’s approach shows that every MS-formula is equivalent to a monadic
$\Sigma_3$ formula, that is a formula of the form

$\exists X_1 \ldots X_m \forall Y_1 \ldots Y_n \exists Z_1 \ldots Z_p \, \varphi$

where $\varphi$ is first order.

In this paper, we show that this upper bound is tight in the sense that, over
the binary tree with successor functions instead of the prefix ordering, there are
MS-formulas which are not equivalent to any monadic $\Sigma_2$ formula. We even show
that, again on the binary tree with successor functions, there are MS-formulas
which are not equivalent to any Boolean combination of monadic $\Sigma_2$ formulas.
Notice that in many logics of programs the prefix ordering is (in some sense)
definable (for instance with Kozen’s propositional mu-calculus [7]) so it makes
sense to be interested in the monadic second order logic of the binary tree
without this ordering predefined.

A related result is established by Rabin [10]. He investigates there the
expressive power of some restricted notion of tree-automata: the special or Büchi
automata. He shows that there are MS-formulas which cannot be translated into
this particular kind of tree automata. This result is extended by Hafer [5] who
shows that there are MS-formulas that cannot be translated into any Boolean
combination of Büchi automata.

Our result is possibly a consequence of Rabin’s and Hafer’s results provided
monadic $\Sigma_2$ formulas characterize exactly Büchi tree-automata. It is well known
that Büchi tree-automata can be translated into monadic $\Sigma_2$ formulas over the
binary tree with successor functions. The converse is however an open question.¹

¹ More recent works, yet unpublished, suggest unexpectedly that this converse may
also be true ...

2 Definitions and notations

Let $T = \{0, 1\}^*$ be the binary tree, i.e. the set of finite words over the
alphabet {0, 1}, equipped with (immediate) successor functions $r_a : T \to T$
defined by $r_a(w) = w.a$ for a = 0 or 1.

The set of formulas of MSOL we consider in the sequel is built from the
function symbols $r_0$ and $r_1$ to be interpreted by the corresponding successor
functions, equality predicate, Boolean connectives, existential and universal
quantifications of first order variables x, y, ..., and existential and universal
quantifications of (monadic) second order variables X, Y, ... .

We call monadic $\Sigma_0$ and monadic $\Pi_0$ the set of MS-formulas without set
quantifiers and, for any integer n, we call monadic $\Sigma_{n+1}$ (resp. monadic
$\Pi_{n+1}$) the set of MS-formulas formed by a sequence of existential (resp.
universal) set quantifiers followed by a formula of monadic $\Pi_n$ (resp. monadic
$\Sigma_n$).

Any MS-formula $\varphi(x_1, \ldots, x_m, X_1, \ldots, X_n)$, with free first order
variables among $\{x_1, \ldots, x_m\}$ and with free second order variables among
$\{X_1, \ldots, X_n\}$, defines over the binary tree the set of tuples
$(t_1, \ldots, t_m, R_1, \ldots, R_n) \in T^m \times \mathcal{P}(T)^n$ that satisfy the
formula $\varphi$. This is denoted by

$(t_1, \ldots, t_m, R_1, \ldots, R_n) \models \varphi$

(see [11], for instance, for a precise definition of the satisfaction relation).

Notice that any tuple $(R_1, \ldots, R_n) \in \mathcal{P}(T)^n$ corresponds to the
function $v$ from $T$ to $\mathcal{P}(\{X_1, \ldots, X_n\})$ defined by
$v(t) = \{X_i : t \in R_i\}$. In the sequel, such a function is called a
$\{X_1, \ldots, X_n\}$-colored tree (or $\{X_1, \ldots, X_n\}$-tree for short).
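
The correspondence between tuples of node sets and colored trees can be illustrated by a small sketch. It is not part of the original text: nodes are encoded as strings over "01", the sets R_i are assumed finite, and the name `coloring` is hypothetical.

```python
def coloring(regions):
    """Turn a tuple (R_1, ..., R_n), given as a dict from variable names
    to finite sets of nodes (words over '01'), into the function v with
    v(t) = {X_i : t in R_i}."""
    def v(t):
        return frozenset(x for x, nodes in regions.items() if t in nodes)
    return v


# Two set variables interpreted on a few nodes; "" is the root.
v = coloring({"X1": {"", "0"}, "X2": {"0"}})
```

Here `v("0")` collects both variable names, while any node outside both sets gets the empty color.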

Given $R \subseteq T$ a set of nodes of the binary tree, we use the notation $v|R$
that stands for the restriction of $v$ to $R$. Given $v_1 : R_1 \to \mathcal{P}(C_1)$
and resp. $v_2 : R_2 \to \mathcal{P}(C_2)$ two partially $C_1$-colored resp.
$C_2$-colored trees, we say $v_1$ and $v_2$ are compatible when they agree on
$R_1 \cap R_2$, i.e. for any $t \in R_1 \cap R_2$,

$v_1(t) \cap C_1 = v_2(t) \cap C_2$.

Given a partial $C_1$-tree $v_1$ and a partial $C_2$-tree $v_2$ compatible one with
the other, we denote by $(v_1, v_2)$ the partial $C_1 \cup C_2$-colored tree (also
denoted by $C_1, C_2$-colored tree) given by the union of $v_1$ and $v_2$.

3 An Ehrenfeucht-Fraïssé like game

Checking whether a colored binary tree satisfies some MS-formula can be seen 
as a game between two players : the first one trying to show the binary tree does 
satisfy the property, the second one trying to show the converse. 

In the first order case, these game considerations lead to the standard
notion of Ehrenfeucht-Fraïssé game (EF-game), a useful tool to show that a given
property is not definable in first order logic (see for instance [1]).

Here, we define a second order version of this EF-game in order to handle
monadic $\Sigma_2$ formulas.

In order to investigate definability by monadic $\Sigma_2$ formulas, we extend
EF-games to the second order case as follows. This definition can be seen as a
generalization of the second order game (also called the Ajtai-Fagin game)
introduced to handle monadic NP definability (see [4] for a discussion on these
monadic NP games).

Definition 1 (monadic $\Sigma_2$ games). Given three integers c, d and e, given
$V_1$ and $V_2$ two nonempty disjoint sets of B-colored trees, denoting by C and D
two disjoint sets of set variables not in B, with c = |C| and d = |D|, we define
the game $EF_2(V_1, V_2, c, d, e)$ as a play in four (second order) rounds between
two players (called Spoiler and Duplicator) as follows:

1. for each $v \in V_1$, Spoiler chooses a C-coloring $C_1(v)$;

2. Duplicator chooses one $v_2 \in V_2$ and a C-coloring $C_2(v_2)$;

3. Spoiler chooses a D-coloring $D_2(v_2)$;

4. Duplicator chooses one $v_1 \in V_1$ and a D-coloring $D_1(v_1)$.

We say Duplicator wins the game when the two B, C, D-colored trees

$(v_1, C_1(v_1), D_1(v_1))$ and $(v_2, C_2(v_2), D_2(v_2))$

satisfy the same first order formulas of quantifier depth e. Otherwise, Spoiler
wins.
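
The alternation of the four rounds can be made concrete by a brute-force sketch. It is only an illustration under strong assumptions that are not in the original: the structures are finite words (strings) instead of infinite binary trees, a coloring assigns to each position a bitmask over the set variables, and the first order winning condition is replaced by a caller-supplied predicate `equiv`. The names `colorings`, `duplicator_wins` and `same_multiset` are hypothetical.

```python
from itertools import product


def colorings(size, k):
    """All colorings of `size` positions by k set variables (bitmasks)."""
    return list(product(range(2 ** k), repeat=size))


def duplicator_wins(V1, V2, c, d, equiv):
    """Decide the game by exhausting the four rounds of Definition 1."""
    # Round 1: Spoiler fixes a C-coloring C1(v) for every v in V1.
    for choice in product(*(colorings(len(v), c) for v in V1)):
        C1 = dict(zip(V1, choice))
        # Round 2: Duplicator needs some v2 and C2 surviving rounds 3 and 4.
        survives = any(
            # Round 3: Spoiler tries every D-coloring D2 of v2.
            all(
                # Round 4: Duplicator answers with some v1 and D1.
                any(equiv((v1, C1[v1], D1), (v2, C2, D2))
                    for v1 in V1 for D1 in colorings(len(v1), d))
                for D2 in colorings(len(v2), d))
            for v2 in V2 for C2 in colorings(len(v2), c))
        if not survives:
            return False
    return True


def same_multiset(s1, s2):
    """Crude stand-in for first order equivalence: equal multisets of
    (label, C-color, D-color) triples."""
    (v1, c1, d1), (v2, c2, d2) = s1, s2
    return sorted(zip(v1, c1, d1)) == sorted(zip(v2, c2, d2))
```

With this stand-in, Duplicator wins on `["ab"]` against `["ba"]` by mirroring Spoiler's colorings, and loses on `["aa"]` against `["ab"]` because the labels already differ.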

Theorem 1. For any sets $V_1$ and $V_2$ of B-colored trees, and for any integers c,
d, e, if Duplicator has a winning strategy for the game $EF_2(V_1, V_2, c, d, e)$
then there exists no $\Sigma_2$ formula $\varphi$ with free set variables among B and
of the form:

$\exists X_1 \ldots X_c \forall Y_1 \ldots Y_d \, \psi$

with $\psi$ a first order sentence of quantifier depth e such that any $v_1 \in V_1$
satisfies $\varphi$ while no $v_2 \in V_2$ satisfies $\varphi$.

Proof. A classical consequence of the standard definition of the satisfaction
relation.

□

In the sequel, in order to handle the winning condition given in the definition
of $EF_2$-game above, we use the notion of r-type and r, m-equivalence and apply
(the following non game theoretic version of) Hanf’s theorem [6].

Definition 2. Given integer r and node t, let S(t, r) be the set of nodes t' such
that there is an undirected path between t and t' of length at most r. The r-type of
some node $t \in T$ in a C-colored tree v is the graph isomorphism class of
$v|S(t, r)$. Given integer m, we say two C-colored trees $v_1$ and $v_2$ are
r, m-equivalent if, for any r-type $\tau$, the number of nodes of r-type $\tau$ in
$v_1$ equals the number of nodes of r-type $\tau$ in $v_2$, or both these numbers are
greater than m.
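
As a hedged illustration of r, m-equivalence, the following sketch counts r-types with threshold m. It makes simplifying assumptions that are not in the original: structures are finite colored words with a single successor rather than binary trees, so the ball S(t, r) is a window of positions and its isomorphism class is represented by the window contents together with the offset of the centre. The names `r_type` and `rm_equivalent` are hypothetical.

```python
from collections import Counter


def r_type(word, i, r):
    """r-type of position i: the colors within distance r of i,
    tagged with the offset of i inside that window."""
    lo, hi = max(0, i - r), min(len(word), i + r + 1)
    return (i - lo, tuple(word[lo:hi]))


def rm_equivalent(w1, w2, r, m):
    """Every r-type occurs equally often in both words,
    or more than m times in both."""
    c1 = Counter(r_type(w1, i, r) for i in range(len(w1)))
    c2 = Counter(r_type(w2, i, r) for i in range(len(w2)))
    return all(c1[t] == c2[t] or (c1[t] > m and c2[t] > m)
               for t in set(c1) | set(c2))
```

For instance, with r = 1 the words "aaaaa" and "aaaaaa" differ only in the number of interior positions (3 versus 4), so they are equivalent for m = 1 but not for m = 5.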

Theorem 2 (Hanf [6]). For any e, there exist r and m such that if two
C-colored trees $v_1$ and $v_2$ are r, m-equivalent then they satisfy the same
first-order formulas of quantifier depth at most e.

4 Monadic $\Sigma_2$ vs. monadic $\Pi_2$

In this section, we define a property, called the Büchi property, and
show that it is not definable by a monadic $\Pi_2$ formula (hence its complement is
not definable by a monadic $\Sigma_2$ formula). The same property is used by Rabin
in [10] as a witness of properties definable by means of Büchi automata and
whose complement is not definable by means of Büchi automata.

Definition 3 (the Büchi property). Let X be a set variable. Given a {X}-colored
tree v, we say v has the Büchi property B(X) if there is a directed path
in T with infinitely many nodes in the interpretation of X in v.

Equivalently, denoting by Acc(t, u) the property stating that there is a
directed path from t to u, a {X}-colored tree satisfies the Büchi property if and
only if the greatest solution of the set equation $Y = F_X(Y)$ with

$F_X(Y) = \{t \in T : \exists u \in Y,\ X \in v(u) \wedge t \neq u \wedge Acc(t, u)\}$

is non empty. Although not very common, such a definition leads to the definition
of the Büchi property by transfinite induction that we shall use later for our main
result.

Definition 4. For any countable ordinal $\alpha$, let us define $F_X^{\alpha}(T)$ as
the set T when $\alpha = 0$, the set $F_X(F_X^{\alpha_1}(T))$ when
$\alpha = \alpha_1 + 1$ and the set $\bigcap_{\alpha_1 < \alpha} F_X^{\alpha_1}(T)$
when $\alpha$ is a limit ordinal. Let then $B_\alpha(X)$ be the property of
{X}-colored trees stating that $F_X^{\alpha}(T)$ is non empty.
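
The greatest fixpoint reading of the Büchi property can be tried out on a finite structure. The sketch below is an illustration under assumptions that are not in the original: the infinite binary tree is replaced by a finite directed graph given as a successor dictionary, and the condition "t ≠ u and Acc(t, u)" is read as "u is reachable from t by a path of length at least one". The names `strict_reach`, `F` and `buchi_nodes` are hypothetical.

```python
def strict_reach(succ):
    """reach[t] = nodes reachable from t by a path of length >= 1."""
    reach = {t: set(succ[t]) for t in succ}
    changed = True
    while changed:
        changed = False
        for t in succ:
            new = set().union(*(reach[u] for u in reach[t])) - reach[t]
            if new:
                reach[t] |= new
                changed = True
    return reach


def F(Y, X, reach):
    """One application of the operator: nodes that strictly reach
    some node of Y colored by X."""
    return {t for t in reach if any(u in Y and u in X for u in reach[t])}


def buchi_nodes(succ, X):
    """Greatest solution of Y = F(Y), computed by iterating from the
    set of all nodes; on a finite graph the iteration stabilises."""
    reach = strict_reach(succ)
    Y = set(succ)
    while True:
        nxt = F(Y, X, reach)
        if nxt == Y:
            return Y
        Y = nxt


# Node 0 leads into the cycle 1 -> 2 -> 1; node 3 is isolated.
graph = {0: [1], 1: [2], 2: [1], 3: []}
```

On this graph the fixpoint is non empty exactly when X meets the cycle: coloring node 2 keeps nodes 0, 1 and 2, while coloring only node 0 leaves nothing. On a finite graph the iteration stops after finitely many steps, whereas Definition 4 needs transfinite stages on the infinite tree.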

Proposition 1. (1) For any countable ordinal $\alpha$ there are {X}-colored trees
that do satisfy $B_\alpha(X)$ while they do not satisfy B(X). (2) A {X}-colored tree
satisfies the property B(X) if and only if, for any countable ordinal $\alpha$, it
does satisfy $B_\alpha(X)$.

Proof. Fact (2) follows from the fact that the binary tree is countable. To prove
fact (1), let us define, for each countable ordinal $\alpha$, the {X}-tree
$v_\alpha$ by the following induction. When $\alpha = 0$, $v_\alpha(t) = \emptyset$
for each $t \in T$. When $\alpha = \alpha_1 + 1$, $v_\alpha(\epsilon) = \{X\}$ and for
any $t \in T$, $v_\alpha(0.t) = \emptyset$ and $v_\alpha(1.t) = v_{\alpha_1}(t)$. When
$\alpha$ is a countable limit ordinal, given an arbitrary enumeration
$\alpha_0, \alpha_1, \ldots$ of all ordinals strictly smaller than $\alpha$,
$v_\alpha(\epsilon) = \{X\}$, and for any $t \in T$, any $n \in \mathbb{N}$,
$v_\alpha(0^n.1.t) = v_{\alpha_n}(t)$. None of these $v_\alpha$ satisfies the Büchi
property while, for each ordinal $\alpha$, $v_{\alpha+1}$ does satisfy $B_\alpha(X)$.

□

Before stating and proving the main theorem below, we give some more
technicalities in order to deal with $EF_2$-games. These definitions are inspired by
similar techniques used by the second author in [8].

Definition 5 (r,m-type constraint). An r,m-type constraint is a function f
which associates to each r-type $\tau$ of its domain a number $f(\tau) < m$. We say
$v|R$ is compatible with type constraint f when, for any r-type $\tau$ in the domain
of f, $v|R$ has at most $f(\tau)$ nodes of type $\tau$.

Definition 6 (D-closure of a C-colored finite region). Given integers r
and m, given a C-colored tree v, given a set D of variables disjoint from C,
given finite regions $R \subseteq R_1 \subseteq T$, we say $R_1$ is a D-closure of R
(in v) if, for any r,m-type constraint f and for every D-coloring $v_0$ of R, if
there is a D-coloring $v_1$ of $R_1$ which equals $v_0$ on R and such that the
C, D-coloring $(v, v_1)$ of $R_1$ is compatible with f, then there is a similar
D-coloring of the complete binary tree.

We want to show that indeed D-closures exist; to this goal we begin with a
lemma:

Lemma 1. Given $R \subseteq T$, v a C-colored tree and $v_1$ a D-coloring of R. Given
f an r, m-type constraint. If every finite subset of the binary tree has a
D-coloring which equals $v_1$ on R and, together with v, is compatible with f, then
the entire binary tree has a similar coloring.

Proof. By a classical compactness argument.

Proposition 2. For any integers r and m, for any C-colored tree v, for any
finite region $R \subseteq T$ colored by v, and for any set D of fresh variables,
there exists a D-closure of R.

Proof. Let v, R and D be as above. For each r, m-type constraint f, and for
each D-coloring w of R, applying the previous Lemma, either there exists a
D-coloring of the entire binary tree which, together with v, is compatible with the
constraint f and extends w, or there exists a finite region $R_{f,w}$ with no such
D-coloring. One can check that the union of all these $R_{f,w}$ is a D-closure of R.

□

We can now state and prove our main theorem.

Theorem 3. Over {X}-colored binary trees with successor functions, the Büchi
property B(X) is not definable by any monadic $\Pi_2$ formula.

Proof. Assume B(X) is definable by a $\Pi_2$ formula. This means $\overline{B}(X)$,
the complement of B(X), is definable by a $\Sigma_2$ formula; henceforth, applying
Theorem 1, Spoiler has a winning strategy on the game
$EF_2(\overline{B}(X), B(X), c, d, e)$ for some integers c, d and e. We show below
that Duplicator has a winning strategy for this game hence this assumption is
absurd.

Let C and D be disjoint sets of variables distinct from X with |C| = c and
|D| = d, and let r and m be given by Hanf’s theorem (Theorem 2) for classical
EF-games in e rounds over {X}, C, D-colored binary trees. We want to show that
the second order rounds can be played (by the Duplicator) so that the resulting
pair of {X}, C, D-trees chosen at the end (by the Duplicator) are r, m-equivalent.

(1) First, for each $v \in \overline{B}(X)$, Spoiler chooses a C-colored tree
$C_1(v)$.

(2) Duplicator’s answer is the choice of an {X}-colored tree $v_2 \in B(X)$
together with the choice of the C-colored tree $C_2(v_2)$. This answer is made using
the following lemma.

Lemma 2. There exists a sequence $\{(R_n, w_n, W_n, t_n)\}_{n \in \mathbb{N}}$ where
$R_n \subseteq T$ is a finite subset of the binary tree, $W_n$ is a set of
{X}, C-colored trees of the form $(v, C_1(v))$ with $v \in \overline{B}(X)$ and
$t_n \in T$, and such that T equals the union of all $R_n$ and for each n:

1. $w_n : R_n \to \mathcal{P}(C \cup \{X\})$ is a {X}, C-coloring of $R_n$,

2. all {X}, C-colored trees of $W_n$ are compatible with $w_n$ and color $t_n$ by X,
and $W_{n+1} \subseteq W_n$,

3. $R_{n+1}$ is a D-closure of $R_n$ in every element of $W_{n+1}$,

4. $t_n$ is a proper prefix of $t_{n+1}$ and belongs to $R_{n+1}$,

5. for each countable ordinal $\alpha$, there exists $w \in W_n$ such that the binary
tree rooted at $t_n$ and {X}, C-colored by w satisfies $B_\alpha(X)$.

Proof. By induction on n. Let us take $R_0 = \emptyset$, $W_0$ the set of
{X}, C-trees $(v, C_1(v))$ for each $v \in \overline{B}(X)$ such that
$X \in v(\epsilon)$ for $\epsilon$ the root of the binary tree and $t_0 = \epsilon$.
Property 5 is ensured by Proposition 1.

Assume the construction has been made up to some integer n. For each
$t' \in T$ strictly above $t_n$, each finite $R' \supseteq R_n$ (that also contains
$t_n$ and any node at a distance at most n from the root) and each {X}, C-coloring
$w'$ of $R'$, let us define the set $O(t', R', w')$ of all countable ordinals
$\alpha$ such that there exists $(v, C_1(v)) \in W_n$ such that:

1. $X \in v(t')$,

2. $(v, C_1(v))|R'$ and $w'$ are equal,

3. the {X}-colored binary tree rooted at $t'$ and colored by v satisfies
$B_\alpha(X)$,

4. $R'$ is a D-closure of $R_n$ w.r.t. $(v, C_1(v))$.

By the induction hypothesis, for each countable ordinal $\alpha$ there exists
$w \in W_n$ such that the binary tree rooted at $t_n$ and colored by w has the
$B_{\alpha+1}(X)$ property. Also by the definition of $B_{\alpha+1}$, there exists
some $t'$ strictly above $t_n$ as in item 3; hence we have

$\bigcup_{t', R', w'} O(t', R', w') = \omega_1$

Since there are countably many possible values for $t'$, $R'$ and $w'$, one of these
$O(t', R', w')$ has size $\aleph_1$ and we take in this case $t_{n+1} = t'$,
$R_{n+1} = R'$ and $w_{n+1} = w'$. The set $W_{n+1}$ is defined as the set of all
$w \in W_n$ such that $w|R_{n+1}$ equals $w_{n+1}$, $X \in w(t_{n+1})$ and $R_{n+1}$ is
a D-closure of $R_n$ in w.

□

Proof of Theorem 3 (continued). Now, let us define the {X}, C-tree
$(v_2, C_2(v_2))$ chosen by Duplicator as $\bigcup_n w_n$. Notice that $v_2$ does
satisfy the Büchi property since for each n, $X \in v_2(t_n)$.

(3) Spoiler chooses a D-colored tree $D_2(v_2)$.

(4) Duplicator’s answer is made as follows.

We note that there exists a finite subset $R \subseteq T$ such that the
{X}, C, D-colored tree $(v_2, C_2(v_2), D_2(v_2))$ is r, m-equivalent to its
restriction to R. So let n be an integer such that $R \subseteq R_{n-1}$ and let f
be the minimal (w.r.t. the usual order on partial functions) r,m-type constraint
satisfied by the {X}, C, D-colored tree above.

Duplicator chooses some (any) {X}, C-colored tree $w \in W_n$ with
$w = (v_1, C_1(v_1))$ (henceforth $(v_1|R_n)$ equals $(v_2|R_n)$). An adequate
D-coloring $D_1(v_1)$ can be chosen as follows.

By the definition of D-closure, since $R_n$ is the D-closure of $R_{n-1}$ with
respect to the {X}, C-coloring $(v_1, C_1(v_1))$, there exists a D-coloring
$D_1(v_1)$ of the binary tree that equals $D_2(v_2)$ restricted to $R_n$ and such
that $(v_1, C_1(v_1), D_1(v_1))$ is compatible with f.

Since $(v_1, C_1(v_1), D_1(v_1))$ and $(v_2, C_2(v_2), D_2(v_2))$ restricted to R,
$R_{n-1}$ or $R_n$ are r, m-equivalent and since f has been chosen minimal w.r.t.
the r-types of nodes of these partially colored trees, the full {X}, C, D-colored
trees

$(v_1, C_1(v_1), D_1(v_1))$ and $(v_2, C_2(v_2), D_2(v_2))$

are r, m-equivalent.

From this point, Hanf’s theorem applies showing that Duplicator wins.

□



5 The case of the boolean closure of $\Sigma_2$ and $\Pi_2$

In order to show that some MS-definable formulas are not definable by any
Boolean combination of $\Pi_2$ and $\Sigma_2$ formulas, let us first prove a normal
form theorem for these boolean combinations.

Lemma 3. Any Boolean combination of $\Sigma_2$ and $\Pi_2$ formulas is equivalent to
a formula of the form
$\pi_1 \wedge (\sigma_1 \vee (\pi_2 \wedge (\sigma_2 \vee \cdots \vee (\pi_n \wedge \sigma_n) \cdots )))$
with the $\pi_i$ monadic $\Pi_2$ formulas and the $\sigma_i$ monadic $\Sigma_2$
formulas.

Proof. Since both $\Sigma_2$ and $\Pi_2$ are closed (up to equivalence) under
conjunction and disjunction, Boolean combinations of $\Sigma_2$ and $\Pi_2$ formulas
are equivalent to finite conjunctions of formulas of the form $\pi \vee \sigma$ for
$\sigma \in \Sigma_2$ and $\pi \in \Pi_2$. It follows that it is sufficient to show
that the set of formulas of the desired form is closed (up to equivalence) under
conjunction with formulas of the form $(\pi \vee \sigma)$. We show this by induction
over n.

For n = 0 nothing has to be proved. Assume it is true up to rank n and let
$\varphi_{n+1} = \pi_1 \wedge (\sigma_1 \vee \varphi_n)$ be a formula of rank n + 1
with $\varphi_n$ a formula of rank n.

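First, we may assume that $\models (\neg\sigma_1) \Rightarrow \pi_1$ and
$\models \varphi_n \Rightarrow (\neg\sigma_1)$. Otherwise, we can take
$\sigma_1 \vee \neg(\pi_1)$ instead of $\sigma_1$ and
$\varphi_n \wedge \pi_1 \wedge \neg\sigma_1$ instead of $\varphi_n$, obtaining
thus an equivalent formula with the same shape and the desired property.

Given then $\pi \in \Pi_2$ and $\sigma \in \Sigma_2$, one can check the formula
$\pi_1 \wedge ((\sigma \wedge \sigma_1) \vee ((\pi \vee (\neg\sigma_1)) \wedge (\sigma_1 \vee (\varphi_n \wedge (\pi \vee \sigma)))))$
is equivalent to $\varphi_{n+1} \wedge (\pi \vee \sigma)$. Applying the induction
hypothesis over $\varphi_n \wedge (\pi \vee \sigma)$ concludes the induction step.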
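
The equivalence claimed in the induction step is purely propositional once the five subformulas are treated as atoms, so it can be checked by exhausting all truth assignments. The following sketch does this; it is an added illustration, and the function names `lhs`, `rhs` and `equivalent` are hypothetical.

```python
from itertools import product


def lhs(pi1, sigma1, phi_n, pi, sigma):
    """phi_{n+1} and (pi or sigma), with phi_{n+1} = pi1 and (sigma1 or phi_n)."""
    return (pi1 and (sigma1 or phi_n)) and (pi or sigma)


def rhs(pi1, sigma1, phi_n, pi, sigma):
    """The rewritten formula of the induction step."""
    return pi1 and ((sigma and sigma1) or
                    ((pi or not sigma1) and
                     (sigma1 or (phi_n and (pi or sigma)))))


def equivalent():
    """Compare both sides on all 32 truth assignments."""
    return all(bool(lhs(*v)) == bool(rhs(*v))
               for v in product([False, True], repeat=5))
```

The check succeeds on every assignment, which matches a case split on $\sigma_1$: when it holds both sides reduce to $\pi_1 \wedge (\pi \vee \sigma)$, and otherwise to $\pi_1 \wedge \varphi_n \wedge (\pi \vee \sigma)$.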

□ 

Let us then define the predicate P(X, x) as the one stating that the binary
tree rooted in node x satisfies the Büchi property with respect to the set
variable X.

Proposition 3. For any node $t \in T$, the property P(X, t) is not definable by a
monadic $\Pi_2$ formula.

Proof. By applying the argument of Theorem 3 to the binary tree rooted in t.

□

Now let us fix two variables X and Y and let us define the predicate A(X, Y, z)
as the least² monadic predicate equivalent to
$P(X, z.0) \vee (\overline{P}(Y, z.0) \wedge A(X, Y, z.1))$,
where $\overline{P}(Y, z.0)$ denotes the negation of P(Y, z.0).

The property $A(X, Y, \epsilon)$ is definable in MSOL since it is definable in the
mu-calculus (see [7]). However:

Theorem 4. Property $A(X, Y, \epsilon)$ is not definable by any boolean combination
of $\Sigma_2$ and $\Pi_2$ formulas.

Proof. Let us assume the converse and let us show this leads to a contradiction.
Assume $A(X, Y, \epsilon)$ is equivalent to a Boolean combination of $\Pi_2$ and
$\Sigma_2$ formulas. Applying Lemma 3, assume such a formula is of the form
$\pi_1 \wedge (\sigma_1 \vee (\pi_2 \wedge (\sigma_2 \vee \cdots \vee (\pi_n \wedge \sigma_n) \cdots )))$.

Applying the fixpoint definition of A(X, Y, z) n times shows that
$A(X, Y, \epsilon)$ is equivalent to

$P(X, 0) \vee (\overline{P}(Y, 0) \wedge ( \cdots P(X, 1^{n-1}.0) \vee (\overline{P}(Y, 1^{n-1}.0) \wedge A(X, Y, 1^n)) \cdots ))$

denoting by $a^k$, as usual, the word composed of k times the letter a.

Let then $X_1, \ldots, X_n, X'$ (resp. $Y_1, \ldots, Y_n, Y'$) be some new fresh
variables. There exists a monadic $\Sigma_1$ formula $\varphi_X$ over the free
variables $X, X_1, \ldots, X_n$ and $X'$ that checks whether X equals the disjoint
union of the intersection of $X'$ with the subtree rooted in $1^n$ and, for each
$k \in [1, n]$, the intersection of $X_k$ with the subtree rooted in $1^{k-1}.0$
(and similarly a formula $\varphi_Y$ for Y). It follows that there also exists, for
each $k \in [1, n]$, a $\Pi_2$ formula $\pi'_k$ (resp. the $\Sigma_2$ formula
$\sigma'_k$) over the free variables $X_1, \ldots, X_n, X'$ and
$Y_1, \ldots, Y_n, Y'$ that checks formula $\pi_k$ (resp. formula $\sigma_k$) does
hold with X and Y implicitly defined by $\varphi_X$ and $\varphi_Y$. From these
definitions it follows that the new formula
$A'(X_1, \ldots, X_n, X', Y_1, \ldots, Y_n, Y', \epsilon)$ defined by

$P(X_1, 0) \vee (\overline{P}(Y_1, 0) \wedge ( \cdots P(X_n, 1^{n-1}.0) \vee (\overline{P}(Y_n, 1^{n-1}.0) \wedge A(X', Y', 1^n)) \cdots ))$

is equivalent to the formula
$\pi'_1 \wedge (\sigma'_1 \vee (\pi'_2 \wedge (\sigma'_2 \vee ( \cdots \pi'_n \wedge (\sigma'_n) \cdots ))))$.

The next step is to show, by induction over k, that there exists a
$\{X_1, Y_1, \ldots, X_k, Y_k\}$-colored tree $v_k$, such that, for each
$i \in [1, k]$:

$v_k \models \forall X_{k+1}, Y_{k+1}, \ldots, X_n, Y_n, X', Y',\ \pi'_i \wedge \neg\sigma'_i$

while

$v_k \models \neg P(X_i, 1^{i-1}.0) \wedge \overline{P}(Y_i, 1^{i-1}.0)$

² Or the greatest; it really does not matter in the following proof.

With k = n this leads to a contradiction since it says in particular that, for any
$i \in [1, n]$, $v_n \models \forall X', Y',\ \pi'_i \wedge \neg\sigma'_i$, from which
it follows that

$v_n \models \forall X', Y',\ \pi'_1 \wedge (\sigma'_1 \vee (\pi'_2 \wedge (\sigma'_2 \vee ( \cdots \pi'_n \wedge (\sigma'_n) \cdots ))))$

hence, by the definition of the $\sigma'_i$s and $\pi'_i$s,

$v_n \models \forall X', Y'\ A'(X_1, \ldots, X_n, Y_1, \ldots, Y_n, X', Y', \epsilon)$

Also, for each $i \in [1, n]$,
$v_n \models \neg P(X_i, 1^{i-1}.0) \wedge \overline{P}(Y_i, 1^{i-1}.0)$ hence,

$v_n \models \forall X', Y'\ A(X', Y', 1^n)$

This is obviously absurd since, given any $\{X', Y'\}$-tree $v'$ that does not
satisfy $A(X', Y', 1^n)$ (and there are some) one has
$(v_n, v') \models \neg A(X', Y', 1^n)$ hence

$v_n \models \exists X', Y'\ \neg A(X', Y', 1^n)$

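The proof by induction goes as follows. For k = 0 nothing has to be proved.
Assume the induction hypothesis for some k. In particular, the formulas

$P(X_{k+1}, 1^k.0) \vee (\overline{P}(Y_{k+1}, 1^k.0) \wedge ( \cdots P(X_n, 1^{n-1}.0) \vee (\overline{P}(Y_n, 1^{n-1}.0) \wedge A(X', Y', 1^n)) \cdots ))$

and $\pi'_{k+1} \wedge (\sigma'_{k+1} \vee ( \cdots \pi'_n \wedge (\sigma'_n) \cdots ))$
are equivalent over any $\{X_1, \ldots, X_n, Y_1, \ldots, Y_n, X', Y'\}$-tree
compatible with $v_k$. It follows that

$v_k \models \forall X_{k+1}, Y_{k+1}, \ldots, X_n, Y_n, X', Y'\ (P(X_{k+1}, 1^k.0) \Rightarrow \pi'_{k+1})$

That is

$v_k \models \forall X_{k+1}\ (P(X_{k+1}, 1^k.0) \Rightarrow (\forall Y_{k+1}, \ldots, X_n, Y_n, X', Y'\ \pi'_{k+1}))$

Since $P(X_{k+1}, 1^k.0)$ is not definable by a monadic $\Pi_2$ formula, the above
implication is strict in the sense that there exists a $\{X_{k+1}\}$-tree $v'$ such
that

$(v_k, v') \models (\forall Y_{k+1}, \ldots, X_n, Y_n, X', Y'\ \pi'_{k+1}) \wedge \neg P(X_{k+1}, 1^k.0)$

In other words, for any colored tree w compatible with $(v_k, v')$ one has

$\overline{P}(Y_{k+1}, 1^k.0) \wedge ( \cdots P(X_n, 1^{n-1}.0) \vee (\overline{P}(Y_n, 1^{n-1}.0) \wedge A(X', Y', 1^n)) \cdots )$

equivalent to $\sigma'_{k+1} \vee ( \cdots \pi'_n \wedge (\sigma'_n) \cdots )$. Again,
in particular,

$(v_k, v') \models \forall Y_{k+1}, \ldots, X_n, Y_n, X', Y'\ (\sigma'_{k+1} \Rightarrow \overline{P}(Y_{k+1}, 1^k.0))$

hence

$(v_k, v') \models \forall Y_{k+1}\ (\neg\overline{P}(Y_{k+1}, 1^k.0) \Rightarrow (\forall X_{k+2}, Y_{k+2}, \ldots, X_n, Y_n, X', Y'\ \neg\sigma'_{k+1}))$

Now, since $\neg\overline{P}(Y_{k+1}, 1^k.0)$ is not equivalent to a monadic $\Pi_2$
formula, there exists a $\{Y_{k+1}\}$-tree $v''$ such that

$(v_k, v', v'') \models \overline{P}(Y_{k+1}, 1^k.0) \wedge (\forall X_{k+2}, Y_{k+2}, \ldots, X_n, Y_n, X', Y'\ \neg\sigma'_{k+1})$

Putting $v_{k+1} = (v_k, v', v'')$ concludes the induction step.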

□

6 As a conclusion ...

One may notice that all the material presented in the last section can be
generalized. The following exercise is one possible formulation of such a
generalization.

Exercise. Let C be the class of (even only finite) directed graphs with a source
such that all nodes are reachable from the source. For any integer n > 2, if there
exists a monadic $\Sigma_n$ definable property of graphs of C which is not monadic
$\Pi_n$ definable, then there also exists a monadic $\Delta_{n+1}$ graph property
(that is a property both definable by a monadic $\Sigma_{n+1}$ formula and a
$\Pi_{n+1}$ formula) which is not definable by a (finite) Boolean combination of
monadic $\Pi_n$ and $\Sigma_n$ formulas.

Hint: think of directed comb-like constructions.

Abstract. We propose so called clausal tableau systems for the common
modal logics K4, KD4 and S4. Basing on these systems, we give more
efficient decision procedures than those hitherto known for the considered
logics. In particular space requirements for our logics are reduced from
the previously established bound $O(n^2 \log n)$ to $O(n \log n)$.



1 Introduction 

It is well known that the complexity of provability for the modal logics K4, KD4
and S4 is PSPACE-complete [5]. Recently, some authors have analyzed space
requirements for modal logics [4,10,1,8]. In [4] Hudelmaier translates formulae
to clausal form and proposes a contraction-free sequent calculus, which is defined
only for clauses and has a decreasing measure for rules, for S4. Using the measure
he has shown that provability for S4 is decidable in $O(n^2 \log n)$-space. Basing
on labeled sequent systems, Vigano in [10] and Basin, Matthews and Vigano in
[1] have given decision procedures for the logics K4 and KD4 using
$O(n^2 \log n)$-space.

In this paper we propose so called clausal tableau systems for the common
modal logics K4, KD4 and S4. Following Hudelmaier, our systems are defined
only for clauses. This simplifies proofs of completeness and gives a good method
to estimate space bounds for modal logics. We give algorithms that for an
L-satisfiable set X of clauses, where L is K4 or KD4 or S4, construct an L-model
for X that has a frame diameter of order $O(n)$. Analyzing the algorithms allows
us to establish a lower space bound, $O(n \log n)$, for the considered logics.

This work is based on the work of Hudelmaier [4]. What makes our space
bounds lower than that of Hudelmaier is that we deal with constructing
models and with saturation. The idea is that if we consider the saturating operation
to be atomic then the search tree will have a depth of order $O(n)$ instead
of $O(n^2)$. Due to lack of space, proofs have been omitted or considerably
shortened; see [8] for details.
2 Preliminaries



2.1 Syntax and Semantics Definition for Modal Logics 

A modal formula, hereafter simply called a formula, is any sequence of symbols
obtained from the following rules: any primitive proposition $p_i$ is a
formula, and if $\varphi$ and $\psi$ are formulae then so are $(\neg\varphi)$, $(\varphi \vee \psi)$, and $(\Box\varphi)$.

We use letters like p and q to denote primitive propositions. We call formulae
of the forms $p$ or $\neg p$ classical literals and use letters like a, b, c to denote them.
We call formulae of the forms $a$, $\Box a$, or $\neg\Box a$ atoms and use letters like A, B, C
to denote them. A simple clause is an atom or a disjunction of atoms. We write
$[A_1, \ldots, A_k]$ to denote the simple clause $A_1 \vee \ldots \vee A_k$. If $\varphi$ is a simple clause
then $\Box^s\varphi$, where $s \geq 1$, is a clause. We use Greek letters like $\varphi$ and $\psi$ to denote
formulae and clauses.

A Kripke frame is a triple $(W, \tau, R)$, where W is a nonempty set (of possible
worlds), $\tau \in W$ is the actual world, and R is a binary relation on W. If $(w, w') \in R$
then we say that the world $w'$ is reachable from the world $w$. A Kripke model
(resp. model graph) is a tuple $(W, \tau, R, h)$ (resp. $(W, \tau, R, H)$), where $(W, \tau, R)$ is
a Kripke frame, and h (resp. H) is a mapping from worlds to sets of primitive
propositions (resp. formulae); that is, $h(w)$ (resp. $H(w)$) is the set of primitive
propositions (resp. formulae) which are “true” at the world w. We sometimes
treat model graphs as models with h being H restricted to the set of primitive
propositions.

A world w in a model graph M is said to be inconsistent if there is a primitive
proposition p such that both $p$ and $\neg p$ belong to $H(w)$. A model graph is
consistent if it contains no inconsistent world.

Given some Kripke model $M = (W, \tau, R, h)$, and some $w \in W$, we write
$M, w \models p$ iff $p \in h(w)$, and say that p is true at w in M. This satisfaction
relation $\models$ is then extended to more complex formulae as follows:

$M, w \models p$ iff $p \in h(w)$;
$M, w \models \neg\varphi$ iff $M, w \not\models \varphi$;
$M, w \models \varphi \vee \psi$ iff $M, w \models \varphi$ or $M, w \models \psi$;
$M, w \models \Box\varphi$ iff for all $v \in W$ such that $R(w, v)$, $M, v \models \varphi$.

We say that M satisfies $\varphi$ at w iff $M, w \models \varphi$. We say that M satisfies $\varphi$, or $\varphi$
is satisfied in M, iff $M, \tau \models \varphi$.
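
The satisfaction relation can be illustrated by a small evaluator over finite models. This is only a sketch: the tuple encoding of formulae and the names `holds`, `worlds`, `rel` and `val` are assumptions made for illustration and do not come from the paper.

```python
# Formulae are nested tuples (an encoding assumed here for illustration):
#   ("p", name), ("not", f), ("or", f, g), ("box", f)

def holds(worlds, rel, val, w, phi):
    """Return True iff M, w |= phi for the finite model M = (worlds, rel, val)."""
    tag = phi[0]
    if tag == "p":                      # primitive proposition
        return phi[1] in val.get(w, set())
    if tag == "not":
        return not holds(worlds, rel, val, w, phi[1])
    if tag == "or":
        return (holds(worlds, rel, val, w, phi[1])
                or holds(worlds, rel, val, w, phi[2]))
    if tag == "box":                    # true at every R-successor of w
        return all(holds(worlds, rel, val, v, phi[1])
                   for v in worlds if (w, v) in rel)
    raise ValueError(f"unknown connective: {tag!r}")


# Two worlds, world 0 sees world 1, and p is true only at world 1.
worlds = {0, 1}
rel = {(0, 1)}
val = {1: {"p"}}
p = ("p", "p")
print(holds(worlds, rel, val, 0, ("box", p)))   # box p at world 0
print(holds(worlds, rel, val, 0, p))            # p at world 0
```

In this toy model, box p holds at world 0 because its only successor satisfies p, while p itself fails at world 0.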

In this paper we will consider the common modal logics K4, KD4 and S4.
These logics require the accessibility relation to be transitive. Additionally, S4
requires the accessibility relation to be reflexive, and KD4 requires the accessibility
relation to satisfy the formula $\forall x\, \exists y\, R(x, y)$. For L being one of the
considered logics, we call these restrictions L-frame restrictions.

We call a model M an L-model if the accessibility relation of M satisfies all
L-frame restrictions. We say that $\varphi$ is L-satisfiable if there exists an L-model of
$\varphi$.

For L being K4 or S4, and for a binary relation R, we write $Ext_L(R)$ to
denote the least extension of R that satisfies all L-frame restrictions. It is clear
that this operator is well defined.
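
For a finite relation, the operator can be computed by a straightforward closure. The sketch below is an illustration under stated assumptions: the function name `ext` and the representation of a relation as a set of pairs are chosen here and are not taken from the paper.

```python
def ext(rel, worlds, logic):
    """Least extension of rel satisfying the frame restrictions of K4 or S4.

    K4: transitive closure.  S4: reflexive and transitive closure.
    """
    if logic not in ("K4", "S4"):
        raise ValueError("ext is only defined for K4 and S4")
    closure = set(rel)
    if logic == "S4":
        closure |= {(w, w) for w in worlds}   # add reflexive pairs
    changed = True
    while changed:                            # naive transitive closure
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


chain = {(0, 1), (1, 2)}
print(sorted(ext(chain, {0, 1, 2}, "K4")))
print(sorted(ext(chain, {0, 1, 2}, "S4")))
```

KD4 is excluded on purpose: seriality can be achieved in several incomparable ways, so a least extension need not exist, which matches the restriction to K4 and S4 in the text.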
2.2 Modal Clauses

We write $\varphi_1; \varphi_2; \ldots; \varphi_k$ to denote the set $\{\varphi_1, \varphi_2, \ldots, \varphi_k\}$. From now on, we use
letters like X, Y, Z to denote sets of formulae or clauses. We write $X; Y$ to denote
$X \cup Y$. For $X = \{\varphi_1, \varphi_2, \ldots, \varphi_k\}$, we write $\Box^s X$ to denote $\Box^s\varphi_1; \Box^s\varphi_2; \ldots; \Box^s\varphi_k$.

We define the modal-depth of a formula, denoted by mdepth, as follows: $mdepth(a) = 0$,
$mdepth(\Box\varphi) = mdepth(\varphi) + 1$, $mdepth(\neg\varphi) = mdepth(\varphi)$, $mdepth(\varphi \vee \psi) =
mdepth(\varphi; \psi) = \max(mdepth(\varphi), mdepth(\psi))$. For a clause $\varphi = \Box^s\psi$, where $\psi$ is
a simple clause, we define the restrictive length of $\varphi$ to be the length of the clause
$\Box\psi$, and denote it by $rlength(\varphi)$. We define the restrictive length of a set of clauses
to be the sum of the restrictive lengths of its elements. Restrictive length can be
understood as length in the common sense if sequences $\Box^s$ are considered to
be primitive connectives.
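
The recursive definition of mdepth translates directly into code. The following is a sketch only; the tuple encoding (with `"and"` standing for the `;` connective) and the function name are assumptions made for illustration.

```python
# Formulae as nested tuples (encoding assumed for illustration):
#   ("p", name), ("not", f), ("box", f), ("or", f, g), ("and", f, g)

def mdepth(phi):
    """Modal depth: the maximal nesting of box operators in phi."""
    tag = phi[0]
    if tag == "p":
        return 0
    if tag == "not":
        return mdepth(phi[1])
    if tag == "box":
        return mdepth(phi[1]) + 1
    if tag in ("or", "and"):
        return max(mdepth(phi[1]), mdepth(phi[2]))
    raise ValueError(f"unknown connective: {tag!r}")


# box [a, not box b] has modal depth 2.
clause = ("box", ("or", ("p", "a"), ("not", ("box", ("p", "b")))))
print(mdepth(clause))
```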

We call two sets of formulae X and Y equisatisfiable in a logic L iff (X is 
L-satisfiable iff Y is L-satisfiable). 

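Lemma 1. Let p be a primitive proposition which only occurs at the indicated
positions. Then the following pairs of sets of formulae are equisatisfiable in any
normal modal logic:

$X; \Box^s[\varphi, \psi \vee \zeta]$ and $X; \Box^s[\varphi, \psi, \zeta]$

$X; \Box^s[\varphi, \neg\neg\psi]$ and $X; \Box^s[\varphi, \psi]$

$X; \Box^s[\varphi, \neg(\psi \vee \zeta)]$ and $X; \Box^s[\varphi, \neg p]; \Box^s[p, \neg\psi]; \Box^s[p, \neg\zeta]$

$X; \Box^s[\varphi, \neg\Box\psi]$ and $X; \Box^s[\varphi, \neg\Box p]; \Box^{s+1}[p, \neg\psi]$

$X; \Box^s[\varphi, \psi]$ and $X; \Box^s[\varphi, p]; \Box^s[\neg p, \psi]$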


This lemma is well known (cf. [6]) and using it we can translate any formula
$\varphi$ to a set X of clauses such that:

- X is a set of clauses of the form $\Box^s[a_1, \ldots, a_k]$, where $s \geq 0$, $k \geq 1$, or
$\Box^s[a, B]$, where $s \geq 0$ and B is an atom of the form $\Box b$ or $\neg\Box b$;

- $\varphi$ and X are equisatisfiable in any normal modal logic;

- the modal-depth of X is equal to the modal-depth of $\varphi$ and the restrictive
length of X is linearly bounded by the length of $\varphi$ (note that X can have
quadratic length).

During translation we treat sequences $\Box^s$ as primitive connectives, which
allows us to do the task and encode X in $O(n \log n)$-space.



2.3 Syntax of Clausal Modal Tableau Systems 

Our formulation of tableau systems is based on the work of Hintikka [3], Rauten- 
berg [9] and Gore [2], and is defined only for sets of clauses. Given a formula, we 
can translate it to an equisatisfiable set of clauses to be usable by our systems. 

A tableau rule R consists of a numerator N above the line and a (finite) list
of denominators $D_1, D_2, \ldots, D_k$ (below the line) separated by vertical bars:

$$\frac{N}{D_1 \mid D_2 \mid \ldots \mid D_k}$$
The numerator is a finite set of clauses and so is each denominator. As we shall
see later, each rule is read downwards as “if the numerator is L-satisfiable, then
so is one of the denominators”.

A tableau system (or calculus) CL is a finite set of tableau rules. A CL-tableau
is a tree with nodes carrying sets of clauses such that if x is a node carrying
X and $y_1, y_2, \ldots, y_k$ are all the child nodes of x, carrying $Y_1, Y_2, \ldots, Y_k$, then
there exists a CL-rule R such that R has k denominators, X is an instance
of the numerator of R, and $Y_i$, for $1 \leq i \leq k$, is the corresponding instance of the
denominator number i of R.

A branch in a tableau is closed if it ends with $\bot$. A tableau is closed if all of
its branches are closed. A tableau is open if it is not closed. A set X of clauses
is said to be CL-consistent if every CL-tableau for X is open. If there is a closed
CL-tableau for X then we say that X is CL-inconsistent.

A tableau system CL is said to be sound if for any set X of clauses, if X is 
L-satisfiable then X is CL-consistent. A tableau system CL is said to be complete 
if for any set X of clauses, if X is CL-consistent then X is L-satisfiable. 



3 Clausal Tableau Systems for K4, KD4 and S4 

Tables 1 and 2 represent clausal tableau rules and calculi for the modal logics 
K4, KD4 and S4. The connective $\boxdot$ has the following semantics: $M, w \models \boxdot\varphi$
iff $M, w \models (\Box\varphi; \varphi)$, i.e. $\boxdot\varphi$ is a shortened form of $\Box\varphi; \varphi$. When dealing with the
logic S4 we assume that the language does not contain the connective $\boxdot$. The
calculi CK4 and CKD4 are built in a similar way as CS4; the rules {KA ), {Kr), 
{KAa), (KAh) and {KAc) are similar respectively to (iC4), (T^), (54o), {SAbj and 
(54c). Note that in S4 we have 0^4> = 0<f>. The proofs of soundness of the calculi 
are straightforward. 



3.1 Completeness of CS4 

Definition 1. Let X be a CS4-consistent set of clauses. Let Y = X and repeat
the following steps until Y does not change:

- If Y is an instance of the numerator of the rule $(\vee)$ then one of the corresponding
instances of denominators of $(\vee)$, let us say Z, must be CS4-consistent.
Set Y = Z.

- Let R be $(S4_a)$ or $(S4_b)$. If Y is an instance of the numerator of R and the
corresponding instance Z of the first denominator of R is CS4-consistent,
then set Y = Z.

It is easily seen that this process always terminates. We call Y a first kind CS4-saturation
of X.

Definition 2. Let X be a CS4-consistent set of clauses. Let Y = X and repeat
the following steps until Y does not change:
A;a 1| . . . I A; At

<!^ = [«i. • • • > «fc] 



X; g; -ig 



X; DF; -nOa 



^ X;ay;a"fa,^a6] . X;DF;D‘'-.Da 

^ X;ny;Da| DF; b^[a, -.D6]; -i6 ^ DF; □*-.□ 0 ; -.0 



(^) ^ 



X;D*fa,D61 
X; D6 I X;D-’[a, n6];a 

X;DF;D‘'-.Da 



X; DF; Q^Z; -.Oa 



/r^T^.N X;DF;a^X;QC/ 

(^^4) 



where F is a set of simple clauses 
0 = (ai, • • • . «ifc] 

i X;0fa, □&! 

X;Dfe|X;Q[a,U];o 

^ X; BF; OZ-, O^U-, Bk -.□6] 

^ ’’’ X; BF; OZ; D^f/; Ba] B F; BX; DU-, B[o, -~Db]^-^b 

,r^. ^ X;BF;aZ;D^{7;B-.Da 

□ h a ^ ' io^;n-Oa;-^a 

where Z is a set of simple clauses 
Table 1. Clausal tableau rules for K4, KD4 and S4 



CL Rules 



CK4 (V), (±), {Kr), (X4a), (X4»), (X4c), (X4) 

CKD4 (V), (±), {Kr), (X4„), (X4fc), (X4c), (X4), (XZ?4) 
CS4 (V), (±), (T.), (54.), ( 545 ), (S4e), (X4) 



Table 2. Clausal tableau calculi for K4, KD4 and S4 
- Let R be $(\vee)$ or {Ty). If Y is an instance of the numerator of R then one
of the corresponding instances of denominators of R, let us say Z, must be
CS4-consistent. Set Y = Z.

- If Y is an instance of the numerator of the rule $(S4_a)$ then set Y to be the
corresponding instance of the second denominator of $(S4_a)$.

It is easily seen that this process always terminates. We call Y a second kind
CS4-saturation of X.

In the following algorithm, and in Algorithm 2 as well, we assume that we
have oracles to compute the steps 2b and 4.

Algorithm 1. Let X be a CS4-consistent set of clauses. We construct a consistent
S4-model graph $M = (W, \tau, R, H)$ that satisfies X as follows:

1. Set $W = \{\tau\}$, $H_0(\tau) = X$, $R_0 = R_1 = \emptyset$, and mark $\tau$ as unsolved.

2. (a) Take an unsolved world w from W.

(b) Let $H_1(w)$ be a first kind CS4-saturation of $H_0(w)$.

(c) For every atom $\neg\Box a$ from $H_1(w)$:

- Create a new world $w_a$, add it to W and mark it as unsolved.

- Set $R_0 = R_0 \cup \{(w, w_a)\}$.

- Set $H_0(w_a) = \{\neg a\} \cup \{\Box Z \mid \Box Z \in H_1(w)\}$.

(d) For every clause $\varphi \in H_1(w)$ such that $\varphi = \Box^s\psi$, where $s \geq 1$ and
($\psi = [a, \neg\Box b]$ or $\psi = \neg\Box b$):

- Let $Y = \{\neg b\} \cup \{\Box Z \mid \Box Z \in H_1(w)\}$.

- If there exists a world $w_\varphi \in W$ such that $w_\varphi = w$ or $R_0^+(w_\varphi, w)$, and
$Y \subseteq H_1(w_\varphi)$, then set $R_1 = R_1 \cup \{(w, w_\varphi)\}$;

- else

• Create a new world $w_\varphi$, add it to W and mark it as unsolved.

• Set $R_0 = R_0 \cup \{(w, w_\varphi)\}$, $H_0(w_\varphi) = Y$.

(e) Mark w as resolved.

3. While there are unsolved worlds, repeat the step 2.

4. For every $w \in W$, let $H(w)$ be a second kind CS4-saturation of $H_1(w)$.

5. Set $R = Ext_{S4}(R_0 \cup R_1)$.

In the step 2d, we use $R_0^+$ to denote the transitive closure of $R_0$. The step
2c corresponds to the rule (K4), the step 2d to the rules $(S4_b)$ and $(S4_c)$. It is
easily seen that the number of possible contents of nodes is finitely bounded. It
follows that this algorithm always terminates.

Lemma 2. Let M be the model graph constructed by Algorithm 1. Then M is
a consistent S4-model graph and for any $w \in W$ and $\varphi \in H(w)$, $M, w \models \varphi$.

Proof. It is clear that M is consistent and R satisfies all S4-frame restrictions.
Suppose that $\varphi \in H(w)$; we show that $M, w \models \varphi$.

It is easily seen that $\varphi$ cannot be of the form $[A_1, \ldots, A_k]$ with $k > 1$.

Case $\varphi = \neg\Box a$: We have $R_0(w, w_a) \wedge (\neg a) \in H_0(w_a)$. Hence $M, w \models \varphi$.

Case $\varphi = \Box^s[a_1, \ldots, a_k]$, where $s \geq 1$: We claim that for any $u \in W$ such
that $\varphi \in H_1(u)$ the following assertions hold:

V, [fli , . . . , 

Vn Ro{u,v) <f)£ Hi{v) 

Vu Ri{u,v) (j) £ Hi(v) 

'iv R{u, v) (M, p t= [ai, . . . , Ok]) 

It follows that Vii 6 W 4> £ Hi{u) {M,u 1= <j>). Since (j> £ H{w), it is easily 
seen that <p € Hi{w). Therefore M,w N <f>. 

Case <f> = □®[a, 06], where s > 1 : We claim that for any u £W such that 
(j> £ Hq{u) the following assertions hold: 

□6 £ H{u) V a € H{u) 

Vu Ro{u,v) (j) £ Ho{v) y Ob £ H{v) 

Vu Ri(u,v) (j) £ Ho(v) V 06 £ H{v) 

It follows that yu £ W <p £ Ho{u) (M,u \= <j>). Since (j> £ H{w), it is easily 
seen that (f> £ Hq{w). Therefore M,w\^ 

Case (t> = 0*[a, -'06], where s > 1 : We claim that for any u £W such that 
4> £ Hi (u) the following assertions hold: 

3r; R{u,v) A (-'6) € H{v) 

yv Ro{u,v) <f) £ Hi(v) V Do £ Hi{v) 

yv Ri(u,v) -></>£ Hi(v) 

It follows that yu £ W 4> £ Hi(u) M, u\= 4>. Since (j) £ H{w), it is easily seen 
that <j) £ Hi{w). Therefore M,w\= 4>. 

Case (f) = D®-iDa, where s > 1, is similar to the preceding case. 

Corollary 1. Let X be a CS4-consistent set of clauses. Let M be the model
graph constructed by Algorithm 1 for the input X. Then M is an S4-model of X.

Proof. By Lemma 2, M is a consistent S4-model graph and $M \models H(\tau)$. Since
$H(\tau)$ is a second kind CS4-saturation of $H_1(\tau)$, which is a first kind CS4-saturation
of X, we conclude that $M \models X$.

We arrive at 

Theorem 1. The calculus CS4 is sound and complete. 



3.2 Completeness of CK4 and CKD4 

In this subsection we use L to denote K4 or KD4, and we write CL to de- 
note the corresponding calculus. We define the first and the second kind of 
CL-saturation in the same way as for CS4 as in Definitions 1 and 2, with (54a), 
(54f,) and (T^) replaced by (X4a), (Xdj,) and {Kr), respectively. We will use 
CreateANewWorldFrom(w, x) to denote the following:

- Let x be a new world, add it to W and mark it as unsolved.

- Set $R_0 = R_0 \cup \{(w, x)\}$.

Algorithm 2. Let X be a CL-consistent set of clauses not containing the connective
$\boxdot$. We construct a consistent CL-model graph $M = (W, \tau, R, H)$ that
satisfies X as follows:

1. Set $W = \{\tau\}$, $H_0(\tau) = X$, $R_0 = R_1 = \emptyset$, and mark $\tau$ as unsolved.

2. (a) Take an unsolved world w from W.

(b) Let $H_1(w)$ be a first kind CL-saturation of $H_0(w)$.

(c) For every atom $\neg\Box a$ from $H_1(w)$:

- CreateANewWorldFrom($w$, $w_a$).

- Set Hoiwa) = {-a} \J{aZ \ G Hi{w)} U 

U {QF I OY G H{w) and F is a set of simple clauses} 

(d) If the connective □ occurs in Hi(w) then: 

For every clause (p G Hi{w) such that <j> — Sip, where s > 1 and (ip = 
[a, -lOft] or Ip = -<nb): 

- CreateANew WorldFrom(w, ). 

- Set LfoK) = {-6}U I D^U G iriH}U 

{□F I □ F G Hi(w) V OF G Hi{w) 
and Y is a set of simple clauses}. 

(e) If the connective □ does not occur in Hi{w) then: 

For every clause p G H\{w) such that p = □V’, where s > 1 and (p = 
[a, -lOft] or p = -<Ob): 

- LetY = {-^b}\J {SZ I SZ£Hi{w)}. 

— If there exists a world w,f, £W such that w,f, = w or Rq{w^, w), and 
Y C then set = i?i jj {{w,w,/,)}; 

- else 

• CreateANew WorldProm(w, w^j, ). 

• Set Ho{w,f,) = Y. 

(f) If L is KD4 and there is no u such that i?o(u;,u) or Ri{w,u) then 

— If the connective □ occurs in H\(w) then 

• Create AN ewWorldFrom(w,w ). 

• Set Ho{w') = {az I a^Z G Hi{w)} (J 

{□F I □ F G Hi{w) V DF G Hi{w) 
and Y is a set of simple clauses}. 

- else set i?o = Ro U 

(g) Mark w as resolved.

3. While there are unsolved worlds, repeat the step 2.

4. For every $w \in W$, let $H(w)$ be a second kind CL-saturation of $H_1(w)$.

5. Set $R = Ext_{K4}(R_0 \cup R_1)$.

The step 2c corresponds to the rule {K4 ), the steps 2d and 2e to the rules 
{KAp) and (A4c), and the step 2f to the rule {KDY). Let I = rlength{X) and 
s = mdepth{X). On any path in the tree Rq, there is at most one world created 
at the step 2c (with w — t), and at most I worlds created at the steps 2d or 2f, 
with depths greater than s. Moreover, the number of possible contents of nodes 
is finitely bounded. It follows that this algorithm always terminates. 
Lemma 3. Let X be a CL-consistent set of clauses not containing the connective
$\boxdot$. Let M be the model graph constructed by Algorithm 2 for X. Then M is an
L-model of X.

The proof of this lemma is similar to the proof of Lemma 2 and Corollary 1, so 
we omit it. We arrive at 

Theorem 2. The calculi CK4 and CKD4 are sound and complete. 

4 Space Bounds for the Logics 

Lemma 4. Let X be a CS4-consistent set of clauses, with rlength(X) = n. Let
$R_0$ be the relation computed by Algorithm 1 for X. Then $R_0$ is a tree with a
depth bounded by n.

Proof. Suppose that the depth of Rq is greater than n. It is easily seen that there 
exist two different worlds w, u such that; Rf){r, w)ARq{w, u), H\(w) = {-'f)} U Y, 
Hi{u) = {-‘b}\JZ, for some b, Y, Z such that Y ^ Z and Y and Z contain 
only clauses of the form □*(/>, where s > 1. Since Y ^ Z, either the rule (54a) or 
(SAb) must be applied on the path from w to u. Observe that if £ H\{w), 
where s > 1, then: 

- If $\varphi$ is of the form $[a_1, \ldots, a_k]$ then $\Box^s\varphi \in H_1(u)$.

- If $\varphi$ is of the form $[a, \Box b]$ then either $\Box^s\varphi$ or $\Box b$ belongs to $H_1(u)$.

- If $\varphi$ is of the form $[a, \neg\Box b]$ then either $\Box^s\varphi$ or $\Box a$ belongs to $H_1(u)$.

This contradicts the fact that Hi{w) is a first kind CS4-saturation of Ho{w). 

Consider the nondeterministic algorithm obtained from Algorithm 1 by deleting
the steps 4 and 5, ignoring computing W and $R_1$, and replacing the step 2b
by:

“Nondeterministically choose a candidate for first kind CS4-saturation of
$H_0(w)$ and assign it to $H_1(w)$. If all candidates for second kind CS4-saturation
of $H_1(w)$ are inconsistent in the sense that they contain both $p$ and $\neg p$ for
some p, or if the depth of w in the tree $R_0$ is greater than n, then reject the
computation.”

It is easily seen that for any set X of clauses, X is S4-satisfiable iff the new
algorithm has an unrejected computation for X. We can simulate the mentioned
algorithm by a deterministic one, denoted by $A_1$, by backtracking. During the computation
of $A_1$ for X we only need to keep information about the current path
from the root $\tau$ to the current world.

We will treat sequences $\Box^s$ as primitive connectives. Let n be the restrictive
length of X. For each world on the current path, we need $O(\log n)$-space to keep
information about “and” branching from its parent to it, i.e. to keep the position
either of the atom $\neg\Box a$, in the case of the step 2c, or of the clause $\varphi$, in the case
of the step 2d, in the sequential display of X. For each formula $\varphi$ of the form
$\Box^s[a, \Box b]$ (resp. $\Box^s[a, \neg\Box b]$), where $s \geq 1$, we need $O(\log n)$-space to keep the
depth at which it is replaced by $\Box b$ (resp. $\Box a$) as a result of applying the rule
$(S4_a)$ (resp. $(S4_b)$), i.e. to keep information about “or” branching caused by the
formula. Note that formulae of the form $[A_1, \ldots, A_k]$, where $k > 1$, can occur
only in the root $\tau$.

We do not keep the sets $H_0(w)$ and $H_1(w)$, except for w being the current
node or the root node. For any node w on the current path, from the content
of the root node and the information about and-or branching (on the whole of the
path) we can reconstruct the sets $H_0(w)$ and $H_1(w)$ using $O(n \log n)$-space. For
this, it suffices to show that for any worlds w and u on the current path such
that $R_0(w, u)$, having $H_1(w)$ and the information about and-or branching we
can construct $H_0(u)$ and $H_1(u)$ using $O(n \log n)$-space. It is clear that having
$H_1(w)$ and the information about “and” branching from w to u we can construct
$H_0(u)$ using $O(n \log n)$-space. Let k be the depth of u in the tree $R_0$. We have
$1 \leq k \leq n$. The set $H_1(u)$ is obtained from $H_0(u)$ by replacing every clause of
the form $\Box^s[a, \Box b]$ (resp. $\Box^s[a, \neg\Box b]$) for which there is information that it is
replaced by $\Box b$ (resp. $\Box a$) at the depth k by $\Box b$ (resp. $\Box a$). It is clear that the
task can be done in $O(n \log n)$-space.

For the two special worlds, the root and the current, we can use $O(n \log n)$-space
to keep information about them, and to test consistency of candidates
for second kind CS4-saturation of the current world. Since every path has a
depth bounded by n, we conclude that the algorithm $A_1$ can be computed in
$O(n \log n)$-space.

Theorem 3. The logic S4 is decidable in $O(n \log n)$-space.

Proof. Just recall that using $O(n \log n)$-space we can translate any formula $\varphi$
to an S4-equisatisfiable set X of clauses such that the restrictive length of X is
linearly bounded by the length of $\varphi$.

Lemma 5. Let L be K4 or KD4. Let X be a CL-consistent set of clauses,
with rlength(X) = l and mdepth(X) = s. Let $R_0$ be the relation computed by
Algorithm 2 for X. Then $R_0$ is a tree with a depth bounded by $2l + s$.

The proof of this lemma is similar to the proof of Lemma 4. Reasoning similarly
as for CS4, we arrive at

Theorem 4. The logics K4 and KD4 are decidable in $O(n \log n)$-space.

5 Conclusion 

We have presented clausal tableau systems for the modal logics K4, KD4 and
S4, and have shown that these logics are decidable in $O(n \log n)$-space. This
space bound is lower than those hitherto known. The method used in this paper
is applicable to other modal logics, including K, KD, T, KB, KDB and B;
see [7] for details.
Abstract. We consider the expansion property for the bases-exchange
graph of a matroid, which gives an effective randomized approximation
scheme for the number of bases. Mihail and Feder in [6] proved that this
property holds for the class of so-called balanced matroids, including all
regular matroids. We present a much simpler proof of a similar result for
all matroids satisfying the exchange-preservation property, defined as a
graph-theoretic condition on the family of internal exchange graphs. We
apply a method of bounding path congestion proposed in [8], which is of
additional interest as it was suggested in [6] that this technique seems to
be unsuitable for matroid-related graphs. As an illustrating example we
apply our results to a subclass of transversal matroids.



1 Introduction 

Motivation and related research. Among several approaches to the study of
matroids, the approach emphasizing bases has the advantage that any base of a
given matroid can be transformed by means of consecutive exchanges of a single
element into any other base, possibly in many ways. Thus it seems appropriate
to regard each base as a vertex of a graph and each pair of bases differing by
a single exchange as adjacent. The graph obtained in this fashion is called the
bases-exchange graph of a matroid [4, 12]. A lot of work has been done concerning
the structure of exchanges in bases. Several classes of matroids have been
defined by their exchange property: base orderable and strongly base orderable
matroids, binary matroids (possessing an odd number of symmetric exchanges),
as well as the class of all matroids (symmetric exchanges and symmetric subset
exchanges) [7,15,17,18]. Our main motivation was to study the structure
of symmetric exchanges in matroid bases and its correlation with the expansion
property of the bases-exchange graph.

The Matroid Expansion Conjecture (see [13]) states that for every matroid 
its bases-exchange graph has cutset expansion at least 1. On the other hand at 
least inverse polynomially bounded expansion (w.r.t. the size of matroid) of such 
graph implies the existence of an effective procedure approximating the number 
of bases. The approximation scheme is based on a simulation of a Markov chain, 
corresponding to a random walk on the bases-exchange graph. In the case when this
Markov chain is rapidly mixing, we have an effective procedure to approximate
the number of bases, called a fully polynomial randomized approximation scheme [8,
10]. It is the best we can achieve, as the exact enumeration of bases is proved to
be #P-hard [1].

For certain classes of matroids researchers managed to prove the rapid mixing
property of this random walk [3,6,11,13], but the problem still remains
open for the class of all matroids. The authors emphasized that matroids satisfy
very interesting properties for single-step exchanges, which offers possibilities for
searching for an argument for rapid mixing similar to the canonical path method proposed
by Jerrum and Sinclair [8]. Unfortunately, these properties are not known
to hold for sequences of exchanges, thus it was doubted that the analogue of the
Jerrum-Sinclair type argument for expansion could be applied in the case of the
bases-exchange graph. We propose to revise this opinion by showing that this
method is applicable to certain classes of matroids.

Results. In the paper we consider the class of strongly base orderable matroids
(see [2, 17]), containing all gammoids [15], and, in particular, all transversal
matroids. The base orderable bijection [17] between any pair of bases gives rise
to a family of paths having a common source and destination base. It is worth
mentioning that strongly base orderable matroids are investigated for the first time
in this context. We formulate the exchange-preservation property, satisfied by a subclass
of strongly base orderable matroids. Our main result is to show that this
property implies expansion at least 5 of the bases-exchange graph of a matroid.
In the proof we apply the method of canonical paths, demonstrating in this way
that its applicability is not restricted to Markov chains with regular structure,
as suggested in [6, 13]. We also show that the class under consideration contains
all binary transversal matroids.

It is surprising that exchange-preservation property, crucial in our approach, 
is similar to the strong base orderability condition - while the former property 
guarantees the expansion of the bases exchange graph, for the latter one this 
remains a challenging open problem. 

Organization of the paper. In Section 2 we review some basic facts and
definitions from matroid theory. Then in Section 3 we introduce the notion of the
exchange-preservation property and show that it implies a cutset expansion.
Section 4 is devoted to the study of transversal matroids. Their binary subclass turns
out to have the exchange-preservation property. We also give there an alternative
proof of the strong orderability of transversal matroids. A brief summary,
final remarks and open problems are contained in the concluding section.
wishes to thank HNI for the invitation and hospitality.
2 Preliminaries

Matroids. A matroid is a powerful abstraction which generalizes both graphs
and vector spaces. A matroid $\mathcal{M}$ consists of a ground set E of which some
subsets are declared to be independent. The independent sets must satisfy three
properties:

(1) the empty set is independent,

(2) all subsets of an independent set are also independent,

(3) if U and V are independent and $|U| > |V|$ then some element of U can be
added to V to yield an independent set.

We will denote the set of independent sets of a matroid by $\mathcal{I}$, hence $\mathcal{M} = (E, \mathcal{I})$.
All maximal independent sets, called bases of $\mathcal{M}$, have the same cardinality
(called the rank of a matroid). The set of all bases will be denoted by $\mathcal{B}$. Equivalently
one can also define a matroid $\mathcal{M}$ by giving its set of bases; the family
of independent sets is then induced as the family of all subsets of bases. For
$B_1, B_2 \in \mathcal{B}$ we have the following base axiom: for each $x \in B_1 \setminus B_2$ there exists
$y \in B_2 \setminus B_1$ such that $(B_1 \cup \{y\}) \setminus \{x\} \in \mathcal{B}$.
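
The base axiom can be checked by brute force on a small family of sets. The sketch below assumes bases are given as frozensets; the function name `satisfies_base_axiom` is hypothetical and introduced only for illustration.

```python
from itertools import combinations

def satisfies_base_axiom(bases):
    """Check: for all bases B1, B2 and x in B1 - B2 there is y in B2 - B1
    such that (B1 | {y}) - {x} is again a base."""
    base_set = set(bases)
    return all(
        any(((b1 | {y}) - {x}) in base_set for y in b2 - b1)
        for b1 in base_set
        for b2 in base_set
        for x in b1 - b2
    )


# Bases of the uniform matroid of rank 2 on a 4-element ground set.
uniform = [frozenset(c) for c in combinations(range(4), 2)]
print(satisfies_base_axiom(uniform))

# A family that is not the set of bases of any matroid.
broken = [frozenset({0, 1}), frozenset({2, 3})]
print(satisfies_base_axiom(broken))
```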

For a matroid $\mathcal{M}$ the bases-exchange graph $\mathcal{G}_\mathcal{M}$ is a graph whose nodes are
all bases of $\mathcal{M}$. There exists an edge between $B_1$ and $B_2$ in $\mathcal{G}_\mathcal{M}$ iff $|B_1 \triangle B_2| = 2$,
where the symbol $\triangle$ denotes the symmetric difference, i.e. $B_2$ can be obtained from
$B_1$ (and $B_1$ from $B_2$) by a fundamental exchange operation $B_2 = B_1 \setminus \{x\} \cup \{y\}$
($B_1 = B_2 \setminus \{y\} \cup \{x\}$). The bases-exchange graph was first introduced by Edmonds [4];
for further information the reader is referred to [12, 13, 17].
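
As a small illustration, the bases-exchange graph can be built directly from the adjacency condition on the symmetric difference. The function names below are assumptions made for this sketch, and the uniform matroid of rank 2 on four elements is used only as a convenient example.

```python
from itertools import combinations

def uniform_bases(n, k):
    """Bases of the uniform matroid of rank k on the ground set {0, ..., n-1}."""
    return [frozenset(c) for c in combinations(range(n), k)]

def bases_exchange_graph(bases):
    """Edges {B1, B2} of the bases-exchange graph: |B1 symmetric-difference B2| = 2."""
    return {
        frozenset((b1, b2))
        for b1, b2 in combinations(bases, 2)
        if len(b1 ^ b2) == 2
    }


bases = uniform_bases(4, 2)
edges = bases_exchange_graph(bases)
print(len(bases), len(edges))
```

For this example there are 6 bases, each adjacent to the 4 bases sharing exactly one element with it, which gives 12 edges.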

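In the sequel we will also use a family of bipartite graphs

$$\mathcal{IG}_\mathcal{M} = \{\mathcal{IG}(B_1, B_2)\}_{B_1, B_2 \in \mathcal{B}},$$

indexed by pairs of bases $B_1, B_2$ of $\mathcal{M}$, defined as follows. For any two bases
$B_1, B_2$, the sets of vertices of $\mathcal{IG}(B_1, B_2)$ are $B_1$ and $B_2$; its edges are all the pairs
$(x, y) \in B_1 \times B_2$ satisfying $(B_1 \setminus \{x\}) \cup \{y\} \in \mathcal{B}$ and $(B_2 \setminus \{y\}) \cup \{x\} \in \mathcal{B}$. We
call the graph $\mathcal{IG}(B_1, B_2)$ an internal exchange graph of bases $B_1, B_2$.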
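
The edge condition of an internal exchange graph can be evaluated directly when the set of bases is small. This is a sketch under the assumption that bases are frozensets and that membership in the set of bases serves as the oracle; the function name is hypothetical.

```python
from itertools import combinations

def internal_exchange_graph(b1, b2, base_set):
    """Edges (x, y) in B1 x B2 such that both symmetric exchanges give bases."""
    return {
        (x, y)
        for x in b1
        for y in b2
        if ((b1 - {x}) | {y}) in base_set and ((b2 - {y}) | {x}) in base_set
    }


# Uniform matroid of rank 2 on {0, 1, 2, 3}.
base_set = {frozenset(c) for c in combinations(range(4), 2)}
print(sorted(internal_exchange_graph(frozenset({0, 1}), frozenset({2, 3}), base_set)))
print(sorted(internal_exchange_graph(frozenset({0, 1}), frozenset({1, 2}), base_set)))
```

For disjoint bases of this uniform matroid every pair is an edge. When the bases share an element, that element is only paired with itself, since any other pairing would shrink one of the two sets.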

As examples of matroids we recall here a graphic matroid (also called a
cycle matroid of a graph), a vectorial and a uniform one. For any connected
graph G, we define a matroid $\mathcal{M}(G)$ as follows: the independent sets are sets
of edges of all acyclic subgraphs (forests) of G; bases are all spanning trees of
G. Another example of a matroid is a vectorial matroid, whose ground set is
the set of vectors over a field and whose bases are maximum cardinality linearly
independent subsets of the set of vectors. In the uniform matroid of rank k over an
n-element ground set E (denoted as $U_n^k$), independent sets are all subsets of E of
cardinality $\leq k$. For a detailed treatment of matroid theory the interested reader
is referred to [15, 17].

Matroid Expansion Conjecture and counting the bases. We start
with the definition of the expansion.

Definition 1. For a graph $\mathcal{G} = (V, E)$ and for $S \subseteq V$, let $\Gamma(S)$ denote the cut
set in $\mathcal{G}$ induced by S, i.e. the set of edges in E with one endpoint in S and one
endpoint in $V \setminus S$. The cutset-expansion (or simply expansion) of $\mathcal{G}$ is:

$$\min_{S \subset V,\ |S| \leq |V|/2} \frac{|\Gamma(S)|}{|S|}$$
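
For very small graphs the cutset expansion can be computed by exhaustive search over all nonempty vertex subsets of size at most half the vertex count. This is an exponential-time sketch intended only to make the definition concrete; the function name and the edge-list representation are assumptions.

```python
from itertools import combinations

def cutset_expansion(vertices, edges):
    """Minimum of |cut(S)| / |S| over nonempty S with |S| <= |V| / 2."""
    vertices = list(vertices)
    best = None
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            s = set(subset)
            # An edge is in the cut when exactly one endpoint lies in S.
            cut = sum(1 for (u, v) in edges if (u in s) != (v in s))
            ratio = cut / size
            if best is None or ratio < best:
                best = ratio
    return best


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
complete = [(u, v) for u, v in combinations(range(4), 2)]
print(cutset_expansion(range(4), cycle))
print(cutset_expansion(range(4), complete))
```

On the 4-cycle the minimum is attained by two adjacent vertices (cut of size 2), and on the complete graph on four vertices by any pair (cut of size 4).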

To the best of our knowledge, there is no counter-example to the following conjecture
proposed in [14].

Conjecture 1. For any matroid $\mathcal{M}$, the cutset expansion of the bases-exchange
graph $\mathcal{G}_\mathcal{M}$ is at least 1.

The positive resolution of the above conjecture, even in a slightly weaker 
form and/or for special classes of matroids (vectorial, transversal, etc.) would be 
of major combinatorial interest. Furthermore, it would also be of fundamental 
algorithmic significance, which is argumented below. 

For a given matroid $\mathcal{M}$ consider the following natural random walk on $\mathcal{G}_\mathcal{M}$,
$(X_t)_{t=0,1,\ldots}$. If $X_t$ is the state (base) at time t, then with probability $\frac{1}{2}$ take
$X_{t+1} = X_t$, and with probability $\frac{1}{2}$ let $X_{t+1}$ be determined as follows: choose
e from $X_t$ and f from E uniformly at random, and if $X' = X_t \setminus \{e\} \cup \{f\} \in \mathcal{B}$
then $X_{t+1} = X'$, otherwise $X_{t+1} = X_t$. We see at once that the Markov chain
$X_t$ can be simulated efficiently (given an independence oracle for $\mathcal{M}$), and that
it converges to the uniform distribution over $\mathcal{B}$ (it is symmetric). Moreover, the
expansion of $\mathcal{G}_\mathcal{M}$ implies that $X_t$ has large conductance, and hence possesses the
rapid mixing property, which amounts, roughly, to $X_t$ approaching its stationary
distribution arbitrarily closely for t = poly-log($|\mathcal{B}|$). Therefore, the natural random
walk on $\mathcal{G}_\mathcal{M}$ can be used as an efficient almost uniform generation scheme for
the set of bases, and in view of the fact of self-reducibility [9, 10] of matroids we
obtain an efficient randomized algorithm to approximately count the number of
bases.
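
One step of the walk described above can be sketched as follows. The oracle `is_base` is a hypothetical stand-in for the independence oracle mentioned in the text, and the uniform matroid of rank 2 is used purely as a toy example.

```python
import random

def lazy_walk_step(base, ground, is_base, rng):
    """One step of the lazy random walk on the bases-exchange graph."""
    if rng.random() < 0.5:          # hold with probability 1/2
        return base
    e = rng.choice(sorted(base))    # element to remove, uniform over the base
    f = rng.choice(sorted(ground))  # element to add, uniform over the ground set
    candidate = (base - {e}) | {f}
    return candidate if is_base(candidate) else base


# Toy oracle (an assumption): uniform matroid of rank 2, so bases are the 2-subsets.
ground = frozenset(range(4))
rng = random.Random(0)
state = frozenset({0, 1})
for _ in range(1000):
    state = lazy_walk_step(state, ground, lambda b: len(b) == 2, rng)
print(sorted(state))
```

If f already belongs to the current base and differs from e, the candidate has too few elements and is rejected, so the walk stays put; this is the "otherwise" branch of the description.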

Strongly base orderable matroids. The results of this paper show that
the expansion conjecture holds for the class of all matroids satisfying the
exchange-preservation property, which is properly contained in the class of
strongly base orderable matroids. We define a matroid $M$ to be strongly base
orderable if for any two bases $B_1$ and $B_2$ there exists a bijection
$\pi : B_1 \to B_2$ such that for all subsets $A \subseteq B_1$,
$(B_1 \setminus A) \cup \pi(A)$ is a base of $M$. It is easy to show that for any
such $\pi$ also $(B_2 \setminus \pi(A)) \cup A$ is a base of $M$. In subsequent
sections the bijection $\pi$ will be called a strongly orderable bijection. In
[17] it was proved that all transversal matroids (see Section 4) and all
gammoids (for the definition see [15, 17]) are strongly base orderable.
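
For very small matroids the definition can be checked by exhaustive search. The sketch below is a brute-force illustration only (exponential in the rank), assuming the matroid is given explicitly as a set of bases; the function names are hypothetical.

```python
from itertools import combinations, permutations


def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def strongly_orderable_bijection(B1, B2, bases):
    """Return a bijection pi: B1 -> B2 (as a dict) such that (B1 - A) | pi(A)
    is a base for every subset A of B1, or None if no such bijection exists."""
    src = sorted(B1)
    for image in permutations(sorted(B2)):
        pi = dict(zip(src, image))
        if all((B1 - A) | {pi[a] for a in A} in bases for A in subsets(src)):
            return pi
    return None


def is_strongly_base_orderable(bases):
    """Check the definition for every ordered pair of bases."""
    return all(
        strongly_orderable_bijection(B1, B2, bases) is not None
        for B1 in bases
        for B2 in bases
    )
```

As a usage example, the uniform matroid of rank 2 on four elements (every 2-subset is a base) passes the check, whereas the cycle matroid of $K_4$ does not.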



3 Cutset expansion 

We consider the problem of bounding the cutset expansion of the bases-exchange
graph of a matroid. Our investigations are restricted to the strongly base
orderable matroids which satisfy the exchange-preservation property
(Definition 2). Theorem 1 claims that cutset expansion for this class is at
least $1/2$. In order to prove it we apply the method of bounding path
congestion using the state-space path encoding [8]. In the next section we prove
that the class of exchange-preserving matroids contains, in particular, all
binary transversal matroids.

Throughout this section we fix a matroid $M = (E, \mathcal{I})$; the set of its
bases will be denoted by $\mathcal{B}$. For any bases $B_1$, $B_2$ of $M$, their
internal exchange graph gives rise to transformations of $B_1$ and $B_2$ into
another pair of bases $B_1'$, $B_2'$, differing only by one element from the
original bases. These transformations are induced by exchanging elements
connected by edges of $IG(B_1, B_2)$. Formally, for every edge
$(x, y) \in B_1 \times B_2$ of $IG(B_1, B_2)$, we define
$\Theta_{x,y}(B_1, B_2) = (B_1', B_2')$ by $B_1' = B_1 \setminus \{x\} \cup \{y\}$
and $B_2' = B_2 \setminus \{y\} \cup \{x\}$. Observe that for every strongly
orderable bijection $\Phi : B_1 \to B_2$, if $\Phi(x) = y$ then
$\Theta_{x,y}(\Phi)$ is also a strongly orderable bijection $B_1' \to B_2'$,
where $\Theta_{x,y}(\Phi)$ denotes $\Phi$ with $x \mapsto y$ replaced by
$y \mapsto x$:

$$\Theta_{x,y}(\Phi) = \Phi \setminus \{(x, y)\} \cup \{(y, x)\}. \qquad (1)$$

Roughly speaking, strongly orderable bijections are preserved by transformations
$\Theta_{x,y}$. Two pairs of bases, $(B_1, B_2)$ and $(B_1', B_2')$, are called
compatible ([18]) if they are over the same multi-set of elements of the
matroid, that is if

$$B_1 \cup B_2 = B_1' \cup B_2' \quad\text{and}\quad B_1 \cap B_2 = B_1' \cap B_2'.$$

In Theorem 1 we consider matroids in which each set of compatible pairs of
bases can be divided into disjoint subsets having a common strongly orderable
bijection, up to transformation (1). If we consider a bases pairs graph, with
the set of vertices consisting of ordered pairs of different bases and edges
linking pairs $(B_1, B_2)$ and $\Theta_{x,y}(B_1, B_2)$, the condition above is
equivalent to the existence of a family of disjoint hypercubes which covers the
bases pairs graph. Each hypercube corresponds to a single strongly orderable
bijection. In the sequel we will identify a strongly orderable bijection between
a pair of bases with a matching in its internal exchange graph.

Definition 2. We say that $IG_M$ has exchange-preservation property if there
exists a family of strongly orderable bijections $\Phi_{B_1,B_2}$ in graphs
$IG(B_1, B_2)$, indexed by pairs of bases of $M$, satisfying for each such pair
$B_1$, $B_2$,

$$\Theta_{x,y}(\Phi_{B_1,B_2}) = \Phi_{\Theta_{x,y}(B_1,B_2)} \quad\text{for every edge } (x, y) \text{ in } \Phi_{B_1,B_2}. \qquad (2)$$

A matroid $M$ is called exchange-preserving if $IG_M$ has exchange-preservation
property. In particular, an exchange-preserving matroid is necessarily strongly
base orderable. The main result of this section gives a sufficient condition on
a matroid to induce satisfactory expansion of its bases-exchange graph.

Theorem 1. If $IG_M$ has exchange-preservation property then $G_M$ has cutset
expansion at least $1/2$.

Proof. Following a method introduced in [8], we construct a family of canonical
paths $\Gamma = \{\gamma_{X,Y}\}_{X,Y \in \mathcal{B}}$ in $G_M$, where
$\mathcal{B}$ denotes the set of bases of $M = (E, \mathcal{I})$. Let
$\{\Phi_{X,Y}\}_{X,Y \in \mathcal{B}}$ be any fixed family of strongly orderable
bijections satisfying (2) in Definition 2. We also fix some linear order $<_E$
in $E$. For arbitrary bases $X$ and $Y$ this order induces a linear ordering
$<_{X,Y}$ of edges connected by bijection $\Phi_{X,Y}$ in the following way:
$(x, y) <_{X,Y} (x', y')$ if $\max(x, y) <_E \max(x', y')$, where
$x, x' \in X \setminus Y$ and $y, y' \in Y \setminus X$. Note that $<_{X,Y}$ is
preserved by transformations $\Theta_{x,y}$, that is, up to transformation (1),
$<_{X,Y} \;=\; <_{\Theta_{x,y}(X,Y)}$ for each $(x, y)$ connected by $\Phi_{X,Y}$.

For every pair of bases $X, Y \in \mathcal{B}$, we define a path $\gamma_{X,Y}$
in $G_M$ as a sequence $X_0, X_1, \ldots, X_m$, where
$m = |X \setminus Y| = |Y \setminus X|$. Observe that each strongly orderable
bijection $X \to Y$ can be equivalently defined as a function
$(X \setminus Y) \to (Y \setminus X)$, since it is necessarily an identity on
$X \cap Y$. A path $\gamma_{X,Y}$ is determined, together with an auxiliary
sequence $Y_0, \ldots, Y_m$, by the following equations, where
$(x_1, y_1) <_{X,Y} \cdots <_{X,Y} (x_m, y_m)$ denote the edges of $\Phi_{X,Y}$
ordered by $<_{X,Y}$:

$$(X_0, Y_0) = (X, Y),$$

$$(X_i, Y_i) = \Theta_{x_i,y_i}(X_{i-1}, Y_{i-1}) \quad\text{for } i = 1, \ldots, m.$$

Obviously $X_m = Y$ and $Y_m = X$; moreover $\gamma_{X,Y}$ contains precisely
$m \le |E|$ edges. To prove that $\Gamma$ is a family of canonical paths for
$G_M$ which implies cutset expansion at least $1/2$, it is sufficient (cf. [16])
to bound the path congestion. For our purposes it is sufficient to show the
following PATH CONGESTION BOUND:

for every edge $(X', X'')$ in $G_M$ the number of paths $\gamma_{X,Y}$
containing this edge is not greater than $|\mathcal{B}|$.

For a fixed edge $(X', X'')$ in $G_M$, let $path(X', X'')$ be the set of pairs
$(X, Y)$ such that the path $\gamma_{X,Y}$ contains $(X', X'')$. In order to
show the PATH CONGESTION BOUND, we define a 1-1 mapping
$\sigma : path(X', X'') \to \mathcal{B}$ as follows:

$$\sigma(X, Y) = X \triangle Y \triangle X' = (X \cap Y) \cup ((X \triangle Y) \setminus X'),$$

where $\triangle$ denotes a symmetric difference. It can be verified that if
$(X, Y)$ belongs to $path(X', X'')$, it must satisfy:

$$X' \subseteq X \cup Y \quad\text{and}\quad X \cap Y \subseteq X'.$$

Hence, it follows easily that $\sigma(X, Y)$ is a base. Moreover, if
$X \neq X'$ (or equivalently $\sigma(X, Y) \neq Y$), then there exists
$1 \le i < m$ such that

$$(X', \sigma(X, Y)) = \Theta_{x_i,y_i}(\ldots \Theta_{x_1,y_1}(X, Y) \ldots). \qquad (4)$$

For the PATH CONGESTION BOUND, it is enough to show that for every base $Y'$
there exists at most one pair $(X, Y)$ for which $\sigma(X, Y) = Y'$. Assuming
that $\sigma(X, Y) = Y'$, we will show that $X$ and $Y$ are determined uniquely
by $Y'$; to this end we will “reconstruct” $X$ and $Y$ from $X'$, $Y'$ and
$X''$. Put $\{x\} := X' \setminus X''$, $\{y\} := X'' \setminus X'$. Let



$$(y_1, x_1) <_{X',Y'} (y_2, x_2) <_{X',Y'} \cdots <_{X',Y'} (y_j, x_j)$$

denote all the edges of $\Phi_{X',Y'}$ less than $(x, y)$ in ordering
$<_{X',Y'}$. The edge $(x, y)$ belongs to $\Phi_{X',Y'}$ by (4) because it
belongs to $\Phi_{X,Y}$. As $(X', Y')$ is on the canonical path starting at $X$,
exactly $j$ exchanges $(x_1, y_1), \ldots, (x_j, y_j)$ were made on the way from
$X$ to $X'$. It means that
$X' = X \setminus \{x_1, \ldots, x_j\} \cup \{y_1, \ldots, y_j\}$, which
uniquely determines $X$ by

$$X := X' \setminus \{y_1, \ldots, y_j\} \cup \{x_1, \ldots, x_j\}.$$

Similarly, (4) allows us to “reconstruct” $Y$ from $Y'$ by

$$Y := Y' \setminus \{x_1, \ldots, x_j\} \cup \{y_1, \ldots, y_j\}.$$



□ 
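
The construction in the proof can be tried out on a toy instance. The sketch below is an illustration under assumptions, not the general algorithm: it uses a partition matroid in which every base picks exactly one element from each of several disjoint blocks, so the bijection between two bases is forced (each element is mapped to the element of the other base in the same block). The ground set is ordered by the usual order on integers, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product


def bases_of_partition_matroid(blocks):
    """Bases pick exactly one element from every block (blocks assumed disjoint)."""
    return [frozenset(choice) for choice in product(*blocks)]


def canonical_path(X, Y, blocks):
    """Path X = X_0, ..., X_m = Y: exchange the pairs (x, y) of the forced
    bijection in increasing order of max(x, y)."""
    pairs = []
    for block in blocks:
        (x,) = X & set(block)  # the unique element of X in this block
        (y,) = Y & set(block)  # the unique element of Y in this block
        if x != y:
            pairs.append((x, y))
    pairs.sort(key=max)
    path, current = [X], X
    for x, y in pairs:
        current = (current - {x}) | {y}
        path.append(current)
    return path


def congestion_and_encodings(blocks):
    """For every directed edge (X', X'') count the canonical paths through it
    and collect the encodings sigma(X, Y) = X ^ Y ^ X' of those paths."""
    bases = bases_of_partition_matroid(blocks)
    load = defaultdict(int)
    codes = defaultdict(set)
    for X in bases:
        for Y in bases:
            if X == Y:
                continue
            path = canonical_path(X, Y, blocks)
            for tail, head in zip(path, path[1:]):
                load[(tail, head)] += 1
                codes[(tail, head)].add(X ^ Y ^ tail)
    return bases, load, codes
```

With `blocks = [(0, 1), (2, 3), (4, 5)]` one can compare `max(load.values())` with `len(bases)` and check that, on each edge, the number of distinct encodings equals the number of paths, which is the injectivity used in the PATH CONGESTION BOUND.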

We conclude this section by noting that requirements in Theorem 1 could be 
slightly relaxed. In fact, for the efficient counting (and generation) of the bases 
of a matroid it is sufficient to have cutset expansion at least inverse polynomial 
with respect to the size of a matroid (cf. [16]). As usual, we assume that the size 
of (a representation of) a matroid is of the same order as the size of its ground- 
set. To be able to formulate appropriate weaker assumptions, we generalize the 
exchange-preservation property introduced in Definition 2: 

Definition 3. We say that $IG_M$ has $k$-local exchange-preservation property if
for each pair $B_1, B_2$ of bases of $M$, there exists a family
$\Phi_{B_1,B_2}$ of strongly orderable bijections in graph $IG(B_1, B_2)$ of
size $\le k$, such that for each such pair $B_1, B_2$,

$$\Theta_{x,y}(\pi) \in \Phi_{\Theta_{x,y}(B_1,B_2)} \quad\text{for each } \pi \in \Phi_{B_1,B_2} \text{ and each edge } (x, y) \text{ in } \pi.$$

Observe that Definition 2 corresponds to 1-local exchange-preservation property.
Note that we deliberately overload notation $\Phi_{B_1,B_2}$ to denote single
strongly orderable bijections (in Definition 2) as well as sets of strongly
orderable bijections (in Definition 3), as this should not lead to any
confusion. The result of Theorem 1 can be easily extended to matroids satisfying
$k$-local exchange-preservation property for any bound $k$.

Theorem 2. If $IG_M$ has $k$-local exchange-preservation property then $G_M$ has
cutset expansion at least 

As an easy conclusion from this theorem we obtain: 

Corollary 1. For matroids with polynomially bounded (w.r.t. the size of a
matroid) local exchange-preservation property there exists an efficient
algorithm to approximate $|\mathcal{B}|$.

4 Transversal matroids 

In this section we argue that the class of exchange-preserving matroids is a
rich subclass of strongly base orderable matroids, namely that it contains all
binary transversal matroids. We begin by recalling the definition of a
transversal matroid. We can associate a matroid with an arbitrary family

$$\mathcal{A} = (A_1, \ldots, A_n)$$

of subsets of a finite set $E$ (groundset). A subset $X = \{x_1, \ldots, x_k\}$
of distinct elements of $E$ is a partial transversal of $\mathcal{A}$ if there
exists a subfamily $(A_{i_1}, \ldots, A_{i_k})$ of $\mathcal{A}$ such that
$x_j \in A_{i_j}$ ($1 \le j \le k$). ($x_j$ is said to represent $A_{i_j}$ in
$X$.) The maximal partial transversals are then the bases of a matroid
$T(E, \mathcal{A})$ on $E$, called a transversal matroid [5, 17]. We say that
the collection $\mathcal{A}$ is a presentation of $T(E, \mathcal{A})$ (it is not
necessarily unique for a given transversal matroid).

Each presentation of a transversal matroid can be associated in a natural
way with a bipartite graph representation. One set of vertices in this bipartite
graph consists of the underlying set of elements of the matroid, the other set is
the family of sets of elements in the matroid presentation; an ‘element’ vertex is
joined to a ‘set’ vertex if the element is contained in the set. Denote this graph
as $G(E, \mathcal{A})$.
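
Deciding whether a set is a partial transversal amounts to finding a matching in this bipartite graph that covers the set. The following sketch uses the standard augmenting-path method for bipartite matching; it is a small illustration, the presentation is assumed to be given as a Python list of sets, and the names `is_partial_transversal` and `transversal_bases` are hypothetical.

```python
from itertools import combinations


def is_partial_transversal(X, family):
    """True iff the distinct elements of X can be matched to distinct sets of
    the family, each element lying in the set that it represents."""
    match_of_set = {}  # index of a set -> element currently representing it

    def try_assign(x, seen):
        # Look for an augmenting path starting at element x.
        for j, A in enumerate(family):
            if x in A and j not in seen:
                seen.add(j)
                if j not in match_of_set or try_assign(match_of_set[j], seen):
                    match_of_set[j] = x
                    return True
        return False

    return all(try_assign(x, set()) for x in X)


def transversal_bases(ground, family):
    """Bases of the transversal matroid, found by brute force over subsets of
    decreasing size (only sensible for very small ground sets)."""
    ground = sorted(ground)
    for r in range(min(len(ground), len(family)), -1, -1):
        found = {
            frozenset(c)
            for c in combinations(ground, r)
            if is_partial_transversal(c, family)
        }
        if found:
            return found
```

For the presentation `[{1, 2}, {2, 3}]` on the ground set `{1, 2, 3}`, every 2-element subset is a base, while `{1, 2, 3}` is not a partial transversal because only two sets are available.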

Transversal matroids are strongly base orderable. We present here a
proof of this fact different from that in [17]. We will construct a strongly
orderable bijection $\pi : (X \setminus Y) \to (Y \setminus X)$ for arbitrary
bases $X$ and $Y$ of a transversal matroid $M$, starting from the bipartite
graph representation associated to an arbitrary presentation $\mathcal{A}$ of
$M$. We will apply this construction in Lemma 1, which states the
exchange-preservation property for the binary subclass of transversal matroids.
Consider two matchings in $G(E, \mathcal{A})$: $m_X$ between $X$ and
$\mathcal{A}$, and $m_Y$ between $Y$ and $\mathcal{A}$, representing $X$ and
$Y$, respectively. Denote by $X_{\neg Y}$ the subset of elements in $X$ matched
(via $m_X$) to some set $A \in \mathcal{A}$ which is not matched (via $m_Y$) to
any element of $Y$. Symmetrically, let $Y_{\neg X}$ denote the subset of
elements in $Y$ matched (via $m_Y$) to some set $A \in \mathcal{A}$ which is not
matched (via $m_X$) to any element of $X$. Notice that
$|X_{\neg Y}| = |Y_{\neg X}|$. Assume that $m_X$ and $m_Y$ are chosen so that
they satisfy:

$$X_{\neg Y} \cap Y = \emptyset \quad\text{and}\quad Y_{\neg X} \cap X = \emptyset, \qquad (5)$$

that is, no element from $X \cap Y$ is in $X_{\neg Y}$ or in $Y_{\neg X}$. If
$m_X$ and $m_Y$ do not satisfy (5), then they can be modified appropriately
according to the following procedure:

Assume $e \in X_{\neg Y} \cap Y$. We can modify $m_Y$ by matching $e$ via $m_Y$
to the same set from $\mathcal{A}$ which is matched to $e$ via $m_X$. This will
result in excluding $e$ from $X_{\neg Y}$, while not enlarging $Y_{\neg X}$, but
possibly some other element $e_1 \in X$ will be added to $X_{\neg Y}$ (namely
the one matched to the same set to which $e$ was matched previously by $m_Y$).
Continuing in the same way with $e_1$, we obtain a sequence
$e_0 = e, e_1, \ldots, e_l$. The crucial observation is that once $e_i$ was
excluded from $X_{\neg Y}$, it can never be added back, since it is matched via
both $m_X$ and $m_Y$ to the same set.
When the procedure finishes, we have two possibilities: either after excluding
$e_l$ no other element was added to $X_{\neg Y}$ (hence globally $e$ was
excluded from $X_{\neg Y}$ and, by the way, $e_l$ was excluded from
$Y_{\neg X}$), or $e_{l-1}$ was replaced by $e_l \notin Y$ (in this case the
global modification of $X_{\neg Y}$ amounts to replacing $e$ by $e_l$, and
$Y_{\neg X}$ remains unchanged). In both cases, the number of elements of
$X_{\neg Y} \cap Y$ was decreased by excluding $e$. Symmetrically we treat each
$e \in Y_{\neg X} \cap X$.

Observe that (5) implies immediately that

$$X_{\neg Y} = \emptyset \quad\text{and}\quad Y_{\neg X} = \emptyset; \qquad (6)$$

to show this, notice that whenever some $e \in X_{\neg Y}$ then by (5)
$e \notin Y$, from which it follows that $Y \cup \{e\}$ is a transversal, which
is obviously false.

Now we are ready to define $\pi$: for each
$x \in (X \setminus Y) \setminus X_{\neg Y}$, we define
$\pi(x) \in (Y \setminus X) \setminus Y_{\neg X}$ as the outcome of the
following procedure:

Let $A_1 \in \mathcal{A}$ be the set matched to $x$ via $m_X$ and let $y_1$ be
the element of $Y$ matched to $A_1$. ($A_1$ is matched via $m_Y$ to some element
of $Y$, since $x \notin X_{\neg Y}$.) If $y_1 \notin X \cap Y$, put
$\pi(x) := y_1$. Otherwise, let $A_2 \in \mathcal{A}$ be the set matched to
$y_1$ via $m_X$ and let $y_2$ be the element of $Y$ matched to $A_2$. Existence
of such $y_2$ is guaranteed by (5). Continuing in this manner we eventually
reach $y_i \notin X \cap Y$ (since $y_1, y_2, \ldots$ are distinct elements of
the finite set $X \cap Y$) and put $\pi(x) := y_i$.

For an argument that $\pi$ is strongly orderable, take any
$x \in (X \setminus Y) \setminus X_{\neg Y}$, and let
$y_1, \ldots, y_i = \pi(x)$ denote the sequence resulting from the procedure
above. After exchanging $x$ and $\pi(x) = y_i$, matchings $m_X$ and $m_Y$ (on
the left) are transformed to $m_X'$ and $m_Y'$ (on the right):



[Figure: the alternating path $x - A_1 - y_1 - A_2 - y_2 - \cdots - A_i - y_i$ under the matchings $m_X$, $m_Y$ (left) and under $m_X'$, $m_Y'$ (right).]



Obviously, $m_X'$ and $m_Y'$ demonstrate that the sets
$X \setminus \{x\} \cup \{\pi(x)\}$ and $Y \setminus \{\pi(x)\} \cup \{x\}$ are
transversals. Moreover, exchanging $x$ and $\pi(x)$ does not eliminate any other
possible exchanges induced by $\pi$ (since the sets of elements from $X \cap Y$
belonging to sequences $y_1, \ldots, y_i$ for different elements $x \in X$ are
disjoint), which implies that $\pi$ is strongly orderable. □

When transversals X and Y are total, each pair of matchings mx and my sat- 
isfies (6), hence induces some strongly orderable bijection. On the other hand, not 
each such bijection is induced (in all presentations) by some pair of matchings, 
even if X and Y are total. We will show that transversal binary matroids satisfy 
exchange-preservation property. The following fact, cited in [2], was proved by 
Edmonds. 

Fact 1. A transversal matroid is binary if and only if it has a presentation such 
that the associated bipartite graph is acyclic. 
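
Whether a given presentation has an acyclic bipartite graph is easy to test with a union-find structure. The sketch below checks one presentation only; Fact 1 speaks about the existence of some acyclic presentation, so a negative answer here does not by itself show that the matroid is non-binary. The function name is hypothetical.

```python
def presentation_graph_is_acyclic(family):
    """True iff the bipartite graph joining each 'set' vertex to its 'element'
    vertices contains no cycle (the family is a list of sets)."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for j, A in enumerate(family):
        for e in A:
            a, b = find(("set", j)), find(("elem", e))
            if a == b:  # both endpoints already connected: this edge closes a cycle
                return False
            parent[a] = b
    return True
```

For example, the presentation `[{1, 2}, {2, 3}]` gives a path and is acyclic, while `[{1, 2}, {1, 2}]` contains a cycle of length four.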
The interesting conclusion from this fact is:

Lemma 1. Transversal binary matroids satisfy the exchange-preservation property.

Proof. The family of strongly orderable bijections $\Phi_{X,Y}$ is defined as
follows: for each pair of bases $(X, Y)$ we take the family of bijections
induced by matchings satisfying (6) in the bipartite graph $G(E, \mathcal{A})$.
We assume that this graph was chosen to be acyclic, according to Fact 1. Assume
that for some bases $X$, $Y$ there are two different strongly orderable
bijections in $\Phi_{X,Y}$, $x_i \mapsto y_i$ and $x_i \mapsto y_{\sigma(i)}$,
both induced by a pair of matchings in $G(E, \mathcal{A})$; $\rightsquigarrow$
represents the alternating path in $G(E, \mathcal{A})$.

$$\begin{array}{ll}
x_1 \rightsquigarrow y_1 & x_1 \rightsquigarrow y_{\sigma(1)} \\
x_2 \rightsquigarrow y_2 & x_2 \rightsquigarrow y_{\sigma(2)} \\
\quad\vdots & \quad\vdots \\
x_k \rightsquigarrow y_k & x_k \rightsquigarrow y_{\sigma(k)}
\end{array} \qquad (7)$$

We argue that this situation implies the existence of a cycle in
$G(E, \mathcal{A})$. Consider the following sequence of vertices:

$$x_1 \rightsquigarrow y_1 = y_{\sigma(\sigma^{-1}(1))} \rightsquigarrow x_{\sigma^{-1}(1)} \rightsquigarrow \cdots$$

where $\rightsquigarrow$ represents a path induced by one of the pairs of
matchings in (7). After a finite number of $\rightsquigarrow$ steps we finally
repeat one of the elements

$$x_1, y_1, x_{\sigma^{-1}(1)}, y_{\sigma^{-1}(1)}, \ldots$$

twice. This gives a cycle in $G(E, \mathcal{A})$, which contradicts Fact 1. □

In [17] it was proved that transversal binary matroids are graphic. Moreover, we
have the following characterization:

Theorem 3 ([2]). Let $G$ be a finite graph. Then the cycle matroid of $G$ is
transversal if and only if $G$ contains no subgraph homeomorphic to $K_4$ or
$C_k^2$ ($k > 2$).

5 Conclusions and open problems 

We have introduced the class of exchange-preserving matroids and shown that
they have a cutset expansion at least $1/2$. This immediately implies the
existence of effective sampling and randomized approximation schemes for bases
of matroids satisfying the exchange-preservation property. We also proved that
the class under consideration contains all binary transversal matroids.

The most challenging open question and a natural continuation of this work
is to try to enlarge this class to all transversal or strongly base orderable
matroids.

We have also presented some preliminary results yielding the expansion for all
strongly base orderable matroids having polynomially bounded families of
strongly orderable bijections preserved by single exchanges.

Further study is needed to find a characterization of the class of matroids
satisfying the polynomially bounded local exchange-preservation property, for
which the expansion follows from Theorem 2.

The comparison of transversal matroids satisfying the exchange-preservation
property and strongly base orderable matroids satisfying this property with the
class of balanced matroids, for which inverse polynomially bounded expansion was
proved in [6], remains an interesting open question.
Abstract. We investigate the semantic foundations of a compositional
proof method for concurrent systems communicating via synchronous
message passing. Basing ourselves on the inductive assertion method
for local verification of synchronous transition diagrams which are
composed both sequentially and in parallel, we present a compositional
proof system that is proved sound and (semantically) complete. The
mathematical foundations of this methodology consist of a purely semantic
view of predicates as sets of states, the introduction of a virtual history
variable into the proofs of basic components, a semantic approximation
called involvement of the syntactic notion of occurrence of a variable or
a channel within an assertion, and the use of projections as a fundamental
technique for formulating compositional proof rules.



1 Introduction 

This paper focusses on the mathematical theory of state-based reasoning about
concurrent program constructs solely through specifications of their parts,
without any reliance on their implementation mechanism, that is, on the semantic
foundations of compositional state-based reasoning about concurrency. The main
advantages of a purely semantic approach are that:

- it highlights the very concept of compositional state-based reasoning about 
concurrency without any syntactic overhead (predicates and assertions are 
viewed semantically as sets of states), and 

- it serves as a basis for the encoding of the program semantics and
corresponding proof rules inside tools such as PVS which support program
verification.

Referring to [7] for the full theory, the present paper illustrates the semantic 
approach for a particular case, namely that of concurrent systems based on syn- 
chronous communication, in which processes can be composed both sequentially 
and in parallel. 

Our own motivation for developing this theory derives from three sources:

1. The dramatic simplification which such a semantical theory represents over
earlier, syntactically formulated, theories for the same concepts, such as, for
instance, those of Job Zwiers [8]; this simplification was a prerequisite for
writing [7].

2. The confrontation with the many tool-based theories for compositional
reasoning about concurrency, and their applications [3, 5, 6]; these made us
wonder which compositional theories these authors actually implemented inside
their tools.

3. More generally, the relationship between operational and axiomatic seman- 
tics of programming languages, specifically, the construction of programming 
logics from a compositional semantics; this line of research was pioneered by 
Samson Abramsky [1]. 

The approach which is followed in this paper is based on the inductive
assertion method [4], which is a methodology for proving state-based transition
diagrams correct. It consists of the construction of an assertion network by
associating with each location of a transition diagram a (state) predicate and
with each transition a verification condition on the predicates associated with
the locations involved; semantically, these predicates are viewed as sets of
states. Thus it reduces a statement of correctness of a transition diagram,
which consists of a finite number of locations, to a corresponding finite number
of verification conditions on predicates. The inductive assertion method can be
trivially generalized to concurrent transition diagrams by viewing a concurrent
transition diagram as the product of its components and thus reducing it to a
sequential system. However, this global proof method leads to a number of
verification conditions which is exponential in the number of components.

Compositional proof methods in general provide a reduction in the complexity
of the number of verification conditions. In this paper we investigate the
semantic foundations of a compositional proof method for concurrent systems
obtained by sequential and parallel composition from sequential transition
diagrams. The transition diagrams of a concurrent system communicate via
synchronous message passing.

Technically, we introduce the new concept of compositionally inductive
assertion networks for reasoning about the sequential parts, i.e., the
transition diagrams, of a concurrent system. By means of compositional proof
rules such assertion networks can be combined for deducing properties of the
whole system. The basic idea of a compositionally inductive assertion network is
the definition of the verification conditions in terms of a logical history
variable which records the sequence of communications generated by the component
(logical variables only occur in assertions, never in programs). The parallel
combination of these compositionally inductive assertion networks is defined in
terms of a simple semantic characterization of the variables and channels
involved in a predicate. (The semantic notion “involved in” of a variable
approximates the corresponding syntactic notion of occurrence.) More
specifically, the notion of the channels involved in a predicate is defined in
terms of a natural generalization of a projection operation on histories to
predicates.
A complicating proof-theoretic consequence of this notion of parallel
combination of compositionally inductive assertion networks is that it requires
the introduction in the proof method of the so-called prefix-invariance axiom
(as initially formulated in [8]). This axiom basically expresses that the
history of communications of a system grows monotonically. One of the main
theoretical advantages of our approach is that it provides a simple semantic
explanation of this prefix-invariance axiom.

2 Concurrent Systems 

We consider parallel and sequentially composed systems the components of which 
communicate by means of synchronous message passing along unidirectional one- 
to-one channels. 

The basic construct of these systems is that of a synchronous transition
diagram, i.e., a labeled directed graph where each label denotes an instruction
$a$. Instructions involve either a guarded state-transformation or a guarded
communication. Given an infinite set of variables VAR, with typical elements
$x, y, z, \ldots$, the set of states $\Sigma$, with typical element $\sigma$, is
given by $VAR \to VAL$, where VAL denotes the given underlying domain of values.
Furthermore, let CHAN be a set of channel names, with typical elements
$c, d, \ldots$. For $c \in CHAN$ and $e$ a semantic expression, i.e.,
$e \in \Sigma \to VAL$, execution of an output statement $c!e$ has to wait for
execution of a corresponding input statement $c?x$, and, similarly, execution of
an input statement has to wait for execution of a corresponding output
statement. If there exists a computation in which both an input statement $c?x$
and an output statement $c!e$ are reached, this implies that communication can
take place and the value of $e$ in the current state is assigned to $x$. We
often refer to an input or output statement as an i/o statement. In general an
instruction $a$ can have the following form:

1. A boolean condition $b \in \mathcal{P}(\Sigma)$ followed by a state
transformation $f \in \Sigma \to \Sigma$, notation: $b \to f$. Transitions of
this form are called internal transitions.

2. A guarded i/o-statement. There are two possibilities:

(a) A guarded output statement $c!e$, notation: $b \to c!e$
($b \in \mathcal{P}(\Sigma)$, $e \in \Sigma \to VAL$).

(b) A guarded input statement $c?x$, notation: $b \to c?x$
($b \in \mathcal{P}(\Sigma)$).

These transitions are called input and output transitions.

Some terminology: In the sequel sets of states often will be called predicates.
We have the following semantic characterization of the variables involved in a
predicate and a state transformation.

Definition 1. A predicate $\phi \in \mathcal{P}(\Sigma)$ involves the variables
$\bar{x}$ if

- $\forall \sigma, \sigma' \in \Sigma.\ \sigma(\bar{x}) = \sigma'(\bar{x}) \Rightarrow (\sigma \in \phi \Leftrightarrow \sigma' \in \phi)$.

This condition expresses that the outcome of $\phi$ only depends on the
variables $\bar{x}$. Similarly, a function $f \in \Sigma \to \Sigma$ involves
the variables $\bar{x}$ if

- $\forall \sigma, \sigma' \in \Sigma.\ \sigma(\bar{x}) = \sigma'(\bar{x}) \Rightarrow f(\sigma)(\bar{x}) = f(\sigma')(\bar{x})$

- $\forall \sigma \in \Sigma,\ y \notin \bar{x}.\ f(\sigma)(y) = \sigma(y)$

The first condition expresses that if two states $\sigma$ and $\sigma'$ agree
with respect to the variables $\bar{x}$, then so do their images under $f$. The
second condition expresses that any other variable is not changed by $f$.

For $\bar{x}$ we will use the notation $f(\bar{x})$ and $\phi(\bar{x})$ to
indicate that $f$ and $\phi$ involve the variables $\bar{x}$. We restrict
ourselves to functions $f$ and predicates $\phi$ for which there exists a finite
set of variables which are involved by $f$ and $\phi$. Since any intersection
with a finite set can be reduced to a finite intersection, the smallest set of
variables involved by $f$ (respectively $\phi$) is well-defined. From now on we
will call this smallest set the set of variables involved by $f$ (respectively
$\phi$), also denoted by $VAR(f)$ (and $VAR(\phi)$). We will use the phrase ‘the
variable $x$ occurs in the state transformation $f$ (predicate $\phi$)’ for
$x \in VAR(f)$ ($x \in VAR(\phi)$). The definition of involvement of a variable
in a predicate is extended in Def. 14 to involvement of a channel.
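
The semantic notion of involvement can be made concrete on a finite toy state space. The sketch below is only an illustration under simplifying assumptions: VAR and VAL are taken to be small finite sets (the paper assumes an infinite VAR), states are Python dictionaries, predicates are Boolean functions on states, and all names are hypothetical.

```python
from itertools import product


def all_states(variables, values):
    """Every state over the given (finite) variables and values."""
    return [dict(zip(variables, vs)) for vs in product(values, repeat=len(variables))]


def involves(predicate, xs, states):
    """True iff any two states agreeing on the variables xs are either both in
    the predicate or both outside it."""
    for s in states:
        for t in states:
            if all(s[x] == t[x] for x in xs) and predicate(s) != predicate(t):
                return False
    return True


def smallest_involved_set(predicate, variables, states):
    """A variable belongs to the smallest involved set exactly when the
    predicate does not involve the set of all the other variables."""
    return {
        x
        for x in variables
        if not involves(predicate, [v for v in variables if v != x], states)
    }
```

For the predicate "x equals y" over the variables x, y, z, the check confirms that the variables x, y are involved, that x alone is not sufficient, and that the smallest involved set is {x, y}.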

Now we define formally our basic program components, which are called basic
synchronous transition diagrams.

Definition 2. A basic synchronous transition diagram is a quadruple
$(L, T, s, t)$, where $L$ is a finite set of locations $l$, $T$ is a finite set
of transitions $(l, a, l')$ with $a$ an instruction as discussed above, and $s$
and $t$ are the entry and exit locations, respectively.

Definition 3. For a set $\mathcal{B}$, with typical element $B$, of basic
synchronous transition diagrams, we define the set of concurrent systems, with
typical element $P$, inductively as follows:
$P ::= B \mid P_1; P_2 \mid P_1 \parallel P_2$

For $P$ a concurrent system, we denote

- by $VAR(P)$ the variables occurring in the state transformations and boolean
conditions occurring in the basic transition diagrams contained in $P$,

- by $CHAN(P)$ the set of channel names occurring in the basic transition
diagrams of which $P$ is composed.

For concurrent systems we also use the phrase ‘the variable $x$ occurs in $P$’
for $x \in VAR(P)$. We call synchronous transition diagrams $P_1, \ldots, P_n$
disjoint if their associated sets of variables are mutually disjoint, and every
channel occurring in $P_1, \ldots, P_n$ is unidirectional and connects at most
two different diagrams. In the sequel we shall assume that only disjoint
diagrams are composed in parallel.

3 Compositional Semantics of Concurrent Systems 

In this section we give a compositional semantics for concurrent systems. First 
we define the semantics of basic synchronous transition diagrams. 
Given a basic synchronous transition diagram $P = (L, T, s, t)$ we define a
labeled transition relation between configurations $(l; \sigma)$, where
$l \in L$. The labels are sequences of communications, denoted by $\theta$. A
communication itself is a pair $(c, v)$, with $c$ a channel name and
$v \in VAL$. It indicates that the value $v$ has been communicated along channel
$c$. We assume the following operations on sequences: the append operation,
denoted by $\cdot$, and the operation of concatenation, denoted by $\circ$. The
projection operation is denoted by $\downarrow$, i.e., $\theta \downarrow C$,
with $C$ a set of channels, denotes the subsequence of $\theta$ consisting of
those communications involving a channel $c \in C$. The empty sequence we denote
by $\epsilon$. Given a concurrent system $P$ we will often write
$\theta \downarrow P$ instead of $\theta \downarrow CHAN(P)$.
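
The three operations on histories are simple enough to write down directly. In the sketch below a history is assumed to be a Python tuple of `(channel, value)` pairs; the function names are hypothetical.

```python
def append(theta, communication):
    """theta followed by one further communication (c, v)."""
    return theta + (communication,)


def concat(theta1, theta2):
    """Concatenation of two histories."""
    return theta1 + theta2


def project(theta, channels):
    """The subsequence of theta consisting of the communications on the given channels."""
    return tuple((c, v) for (c, v) in theta if c in channels)
```

For example, projecting the history `(("c", 1), ("d", 2), ("c", 3))` onto the channel set `{"c"}` keeps the first and third communications in their original order, and projection distributes over concatenation.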

In the definition of the labeled transition relation we will make use of the 
notion of a variant of a state. 

Definition 4. We define

$$\sigma\{v/x\}(y) = \begin{cases} \sigma(y) & \text{if } x \text{ and } y \text{ are distinct,} \\ v & \text{otherwise.} \end{cases}$$
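Definition 5. Let $P = (L, T, s, t)$ be a basic synchronous transition diagram.

- In case of an internal transition $l \xrightarrow{a} l' \in T$, $a = b \to f$,
we have

$(l; \sigma) \xrightarrow{\epsilon} (l'; \sigma')$, if $\sigma \in b$ and where
$\sigma' = f(\sigma)$.

- In case of an output transition $l \xrightarrow{a} l' \in T$,
$a = b \to c!e$, we have, for $v = e(\sigma)$,

$(l; \sigma) \xrightarrow{(c,v)} (l'; \sigma)$, if $\sigma \in b$.

- In case of an input transition $l \xrightarrow{a} l' \in T$, with
$a = b \to c?x$, we define, for an arbitrary value $v \in VAL$,

$(l; \sigma) \xrightarrow{(c,v)} (l'; \sigma\{v/x\})$, if $\sigma \in b$.

Furthermore, we have the following rules for computing the reflexive, transitive
closure:

$$(l; \sigma) \xrightarrow{\epsilon} (l; \sigma) \quad\text{and}\quad \frac{(l; \sigma) \xrightarrow{\theta} (l'; \sigma') \qquad (l'; \sigma') \xrightarrow{\theta'} (l''; \sigma'')}{(l; \sigma) \xrightarrow{\theta \circ \theta'} (l''; \sigma'')}$$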

Using the above transition relation we can now define the semantics of a
basic synchronous transition diagram, in which the value received in an input
transition is selected by local guessing.

Definition 6. Let $P = (L, T, s, t)$ be a basic synchronous transition diagram
and $l \in L$. We define
$\mathcal{O}_l(P) = \{(\sigma, \sigma', \theta) \mid (s; \sigma) \xrightarrow{\theta} (l; \sigma')\}$.

This definition forms the basis of our definition of a compositional semantics
for concurrent systems which is defined inductively w.r.t. their structure.

Definition 7. The initial/final state semantics $\mathcal{O}(P)$ of a basic
synchronous transition diagram $P = (L, T, s, t)$ is simply defined as
$\mathcal{O}_t(P)$.



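$$\mathcal{O}(P_1; P_2) = \{(\sigma, \sigma_2, \theta) \mid \exists \sigma_1, \theta_1, \theta_2.\ (\sigma, \sigma_1, \theta_1) \in \mathcal{O}(P_1) \wedge (\sigma_1, \sigma_2, \theta_2) \in \mathcal{O}(P_2) \wedge \theta = \theta_1 \circ \theta_2\}.$$

$$\mathcal{O}(P_1 \parallel P_2) = \{(\sigma, \sigma', \theta) \mid (\sigma, \sigma_i', \theta \downarrow P_i) \in \mathcal{O}(P_i),\ i = 1, 2,\ \theta = \theta \downarrow (P_1 \parallel P_2)\}.$$

Here $\sigma_i'$ is obtained from $\sigma'$ by assigning to the variables not
belonging to process $P_i$ their corresponding values in $\sigma$ (note that
$P_i$ changes only its own local variables).

Observe that, due to the condition $\theta = \theta \downarrow (P_1 \parallel P_2)$
in the definition of $\mathcal{O}(P_1 \parallel P_2)$, $\mathcal{O}$ does not
contain communications along channels not occurring in $P_1 \parallel P_2$. We
paraphrase this property by saying that $\mathcal{O}$ defines a precise
semantics.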

We observe that in the above definition the requirement that a local history 
of a basic transition diagram can be obtained as the projection of one global 
history 9 guarantees that an input on a channel indeed can be synchronized 
with a corresponding output. 

It is worthwhile to observe here that the above semantics describes a
communication mechanism in which every communication involves one sender and one
receiver only (as is the case in CSP, for example) under our basic assumption
that channels are uni-directional and one-to-one.
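
The two composition clauses of Definition 7 can be illustrated on finite, explicitly listed semantics. The following sketch rests on several assumptions that are not part of the paper: a semantics is a finite Python set of triples, states are dictionaries frozen into sorted tuples so that they can be stored in sets, histories are tuples of `(channel, value)` pairs, and a component is described by a hypothetical triple `(semantics, own_variables, own_channels)`.

```python
def freeze(state):
    """Hashable form of a state given as a dict."""
    return tuple(sorted(state.items()))


def project(theta, channels):
    """Subsequence of theta with the communications on the given channels."""
    return tuple((c, v) for (c, v) in theta if c in channels)


def seq_compose(sem1, sem2):
    """O(P1; P2): glue triples whose intermediate states coincide and
    concatenate their histories."""
    return {
        (s, s2, th1 + th2)
        for (s, s1, th1) in sem1
        for (t1, s2, th2) in sem2
        if s1 == t1
    }


def local_final(sigma, sigma_prime, own_vars):
    """sigma'_i: own variables take their values from sigma', all others from sigma."""
    return {v: (sigma_prime[v] if v in own_vars else sigma[v]) for v in sigma}


def in_parallel(sigma, sigma_prime, theta, comp1, comp2):
    """Membership test for (sigma, sigma', theta) in O(P1 || P2)."""
    channels = comp1[2] | comp2[2]
    if project(theta, channels) != tuple(theta):
        return False  # theta mentions a channel of neither component
    for sem, own_vars, own_chans in (comp1, comp2):
        triple = (
            freeze(sigma),
            freeze(local_final(sigma, sigma_prime, own_vars)),
            project(theta, own_chans),
        )
        if triple not in sem:
            return False
    return True
```

In a small example where one component sends the value 5 on channel c and the other receives on c into y, the receiver's semantics contains one triple per guessed value, and only the guess that agrees with the sender's history survives the membership test for the parallel composition.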

4 A Compositional Proof Method 

In this section we first introduce so-called compositionally inductive assertion 
networks for reasoning about the basic sequential components of a concurrent 
system. Then we give compositional proof rules for deducing properties of the 
whole system. The basic idea of a compositionally inductive assertion network 
is the definition of the verification conditions in terms of a logical history vari- 
able which records the sequence of communications generated by the component. 
The parallel combination of these compositionally inductive assertion networks 
is defined in terms of a simple semantic characterization of the variables and 
channels involved in a predicate. The notion of the channels involved in a
predicate is defined in terms of a natural generalization of the projection
operation on histories to predicates.

We assume given a set of history variables $HVAR \subseteq VAR$ and a distinguished
history variable $h \in HVAR$. A state $\sigma \in \Sigma$ thus assigns to each history variable a
sequence of communications (and to each other variable an element of the given
domain $VAL$). The distinguished history variable $h$ represents the sequence of
communications of the given concurrent system. For every basic synchronous
transition diagram $P$ we require that every state transformation $f$ and boolean
condition $b$ of $P$ satisfies $VAR(f) \subseteq (VAR \setminus HVAR)$ and
$VAR(b) \subseteq (VAR \setminus HVAR)$, i.e., we require that history variables do not occur in any program.

In order to reason about an input statement $c?x$, which involves the
assignment of an arbitrary value to $x$, we need to introduce quantifiers over
variables of the set $(VAR \setminus HVAR)$. We define for a predicate $\phi$: $\sigma \in \exists x.\phi$ iff
there exists $v \in VAL$ such that $\sigma\{v/x\} \in \phi$.

Definition 8. An assertion network $\mathcal{Q}$ for a synchronous transition diagram
$P = (L, T, s, t)$ assigns to each $l \in L$ a predicate $\mathcal{Q}_l$.

We have the following definition of a compositionally inductive assertion
network. In this definition $f(\phi)$ denotes the image of predicate $\phi$ under the state
transformation $f$, i.e., the set of states $\{f(\sigma) \mid \sigma \in \phi\}$.

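Definition 9. A local assertion network $\mathcal{Q}$ for a synchronous transition diagram
$P$ is called compositionally inductive if:

- For $l \xrightarrow{a} l'$ a local transition of $P$, i.e., $a = b \to f$ for some boolean $b$ and
state-transformation $f$, one has $f(\mathcal{Q}_l \cap b) \subseteq \mathcal{Q}_{l'}$.

- For $l \xrightarrow{a} l'$ an output transition of $P$, i.e., $a = b \to c!e$, for some boolean $b$
and channel $c$, one has $g(\mathcal{Q}_l \cap b) \subseteq \mathcal{Q}_{l'}$, where $g(\sigma) = \sigma\{\sigma(h) \cdot (c, e(\sigma))/h\}$.

- For $l \xrightarrow{a} l'$ an input transition of $P$, i.e., $a = b \to c?x$, for some boolean $b$ and
channel $c$, one has $g(\exists x.\mathcal{Q}_l \cap b) \subseteq \mathcal{Q}_{l'}$, where $g(\sigma) = \sigma\{\sigma(h) \cdot (c, \sigma(x))/h\}$.

We denote by $P \vdash \mathcal{Q}$ that $\mathcal{Q}$ is a compositionally inductive assertion network
for $P$.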
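The verification conditions of Definition 9 can be tried out on a small finite universe of states. The following is a minimal sketch, not the authors' method: states are Python dictionaries, the history variable is stored under the key "h", predicates are boolean functions, and all names (`update`, `check_output`, `check_input`, `VAL`, `STATES`) are hypothetical. The check for an input transition quantifies over every value that could be received, which is how the existential quantifier over x is read here.

```python
# Sketch only: finite-universe check of the output and input conditions of
# Definition 9. All names are hypothetical and not taken from the paper.

VAL = (0, 1)  # assumed finite value domain


def update(state, var, value):
    """Return state{value/var}: a copy of the state with one variable changed."""
    new_state = dict(state)
    new_state[var] = value
    return new_state


def output_step(state, chan, expr):
    """g(sigma) = sigma{sigma(h) . (c, e(sigma)) / h} for an output c!e."""
    return update(state, "h", state["h"] + ((chan, expr(state)),))


def input_step(state, chan, var):
    """g(sigma) = sigma{sigma(h) . (c, sigma(x)) / h} for an input c?x."""
    return update(state, "h", state["h"] + ((chan, state[var]),))


def check_output(pre, guard, chan, expr, post, states):
    """Check that g maps every state in (pre and guard) into post."""
    return all(post(output_step(s, chan, expr))
               for s in states if pre(s) and guard(s))


def check_input(pre, guard, chan, var, post, states):
    """Check the input condition: the received value v is arbitrary in VAL."""
    return all(post(input_step(update(s, var, v), chan, var))
               for s in states if pre(s) and guard(s)
               for v in VAL)


# Toy universe: x ranges over VAL and the history is initially empty.
STATES = [{"x": v, "h": ()} for v in VAL]
empty_history = lambda s: s["h"] == ()
always = lambda s: True

# After c!x or c?x the history records the current value of x on channel c.
recorded_x = lambda s: s["h"] == (("c", s["x"]),)
# Claiming that the communicated value is always 0 is too strong for an input.
recorded_zero = lambda s: s["h"] == (("c", 0),)

output_ok = check_output(empty_history, always, "c", lambda s: s["x"],
                         recorded_x, STATES)
input_ok = check_input(empty_history, always, "c", "x", recorded_x, STATES)
input_bad = check_input(empty_history, always, "c", "x", recorded_zero, STATES)
```

In this toy setting `output_ok` and `input_ok` hold, while `input_bad` fails because the environment may send the value 1.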

Definition 10. A partial correctness statement is of the form $\{\phi\}P\{\psi\}$, where
$\phi$ and $\psi$ are predicates, also called the precondition and postcondition, and $P$ is
a concurrent system.

Formally, validity of a partial correctness statement $\{\phi\}P\{\psi\}$ is defined with
respect to the semantics $O$.

Definition 11. We define $\models \{\phi\}P\{\psi\}$ as follows:

for every $(\sigma, \sigma', \theta) \in O(P)$ such that $\sigma \in \phi$, we have $\sigma'\{\sigma(h) \circ \theta/h\} \in \psi$.

Definition 12. For P = (L,T,s,t) a basic synchronous transition diagram, we 
have the following transformation rule: 

$$\frac{P \vdash \mathcal{Q}}{\{\mathcal{Q}_s\}\,P\,\{\mathcal{Q}_t\}}$$

Definition 13. For sequentially composed $P = P_1; P_2$ we have the following
rule:

$$\frac{\{\phi\}\,P_1\,\{\chi\} \qquad \{\chi\}\,P_2\,\{\psi\}}{\{\phi\}\,P\,\{\psi\}}$$

In order to define a rule for parallel composition we introduce the following 
restriction operation on predicates. 

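Definition 14. Let $\phi$ be a predicate and $C$ a set of channels. We denote by
$\phi \downarrow C$ the predicate

$\{\sigma \mid \exists \sigma' \in \phi \text{ s.t. } \sigma(x) = \sigma'(x), \text{ for } x \in VAR \setminus \{h\}, \text{ and } \sigma(h) \downarrow C = \sigma'(h) \downarrow C\}$.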

Note that $\phi \downarrow C = \phi$ indicates that, as far as the dependency of the value of $\phi$
upon the value of $h$ is concerned, the value of $\phi$ only depends on the projection
of the global history $h$ on the channels $C$. More formally, $\phi \downarrow C = \phi$ indicates
that for every $\sigma$ and $\sigma'$, such that $\sigma$ and $\sigma'$ are the same but for the value of the
history variable $h$ and $\sigma(h) \downarrow C = \sigma'(h) \downarrow C$, we have

$\sigma \models \phi$ if and only if $\sigma' \models \phi$.

If $\phi \downarrow C = \phi$ then we also say that '$\phi$ only involves the channels of $C$'.
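The notion of a predicate that only involves the channels of C can be illustrated on a finite set of histories. The sketch below is a simplification under stated assumptions: predicates are taken to depend on the history alone (other variables are ignored), a history is a tuple of (channel, value) pairs, and the names `project`, `only_involves` and `HISTORIES` are hypothetical.

```python
# Sketch only: finite check that a history predicate depends solely on the
# projection of the history onto a set of channels. Names are hypothetical.

def project(history, channels):
    """Projection of a history onto the given set of channels."""
    return tuple((c, v) for (c, v) in history if c in channels)


def only_involves(pred, channels, histories):
    """True iff histories with equal projections are never told apart by pred."""
    return all(pred(h1) == pred(h2)
               for h1 in histories for h2 in histories
               if project(h1, channels) == project(h2, channels))


HISTORIES = [
    (),
    (("c", 0),),
    (("d", 0),),
    (("c", 0), ("d", 0)),
    (("d", 0), ("c", 0)),
]

# "The history is exactly one communication of 0 on c" also constrains d.
exact = lambda h: h == (("c", 0),)
# "The projection onto c is one communication of 0" constrains c only.
on_c = lambda h: project(h, {"c"}) == (("c", 0),)

exact_is_local = only_involves(exact, {"c"}, HISTORIES)
on_c_is_local = only_involves(on_c, {"c"}, HISTORIES)
```

Here `exact_is_local` is false, since the histories with and without a communication on d have the same projection onto c but are distinguished by the predicate, whereas `on_c_is_local` is true.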

We can now formulate the following rule for parallel composition: 

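Definition 15. Let $P = P_1 \parallel P_2$ in the rule

$$\frac{\{\phi_i\}\,P_i\,\{\psi_i\},\quad i = 1, 2}{\{\phi_1 \cap \phi_2\}\,P\,\{\psi_1 \cap \psi_2\}}$$

provided $\psi_i$ does not involve the variables of $P_j$ nor those channels of
$P_j$ which do not occur in $P_i$, i.e., $\psi_i \downarrow C_i = \psi_i$, for some set of channels $C_i$ such
that $C_i \cap (CHAN(P_j) \setminus CHAN(P_i)) = \emptyset$, $i \neq j$.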

Note that the restriction on channels indeed is necessary. Consider for
example a network $c!0 \parallel d!0$ (abstracting from the locations of the components).
Locally, we can prove

$\{h = \epsilon\}\,c!0\,\{h = \langle (c, 0) \rangle\}$ and $\{h = \epsilon\}\,d!0\,\{h = \langle (d, 0) \rangle\}$

(here $h = \epsilon$, for example, denotes the predicate $\{\sigma \mid \sigma(h) = \epsilon\}$). Applying the
above rule without the restriction on channels leads to $\{h = \epsilon\}\,c!0 \parallel d!0\,\{false\}$.

However, this gives rise to incorrect results when further composing the
system $c!0 \parallel d!0$. Observe that, e.g., the postcondition $h = \langle (c, 0) \rangle$ also involves channel
$d$; in fact, $h = \langle (c, 0) \rangle$ involves all channels, and hence the
condition upon the postconditions in the above rule for parallel composition is
violated.

Still we cannot derive the valid correctness statement

$\{\langle d, c \rangle = h\}\,d! \parallel c!\,\{\langle d, c \rangle \le h\}$

(we abstract both from locations and the values sent) which tells us that if the
past history consists of first a communication on $d$ followed by one on $c$, then
after the communications $d!$ and $c!$ one has that $\langle d, c \rangle$ is a prefix (the prefix
relation on sequences is denoted by $\le$) of the new history $h$. It is not difficult to
see that we cannot derive this correctness statement because of the restrictions
on the postconditions in the rule for parallel composition, namely that they
should only involve the channels of the components they describe (and channels
outside of the particular parallel composition concerned).

In order to derive this correctness statement we have to introduce the
following prefix-invariance axiom:

Definition 16.

$\{t = h\}\,P\,\{t \le h\}$.

Additionally we have the usual consequence rule, a conjunction rule and
a suitable substitution rule for eliminating the logical variable $t$ of the prefix
invariance axiom.

Derivability of a partial correctness statement $\{\phi\}P\{\psi\}$ using the rules above,
with $P$ a concurrent system, we denote by $\vdash \{\phi\}P\{\psi\}$.
5 Soundness

A proof of the soundness of the method presented in the previous section
proceeds by induction on the structure of $P$. Here we treat only the basic case;
detailed soundness proofs for the remaining rules are given in [7].

Theorem 1. Let $P = (L, T, s, t)$ be a basic synchronous transition diagram and
$\mathcal{Q}$ be a compositionally inductive assertion network for $P$. We have that $P \vdash \mathcal{Q}$
implies $\models \{\mathcal{Q}_s\}P\{\mathcal{Q}_t\}$.

Proof 

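It suffices to prove that for every computation $(s; \sigma) \xrightarrow{\theta} (l; \sigma')$ of $P = (L, T, s, t)$,
we have that $\sigma \in \mathcal{Q}_s$ implies $\sigma'\{\sigma(h) \circ \theta/h\} \in \mathcal{Q}_l$.

We proceed by induction on the length of the computation. We treat the case
that the last transition involves the execution of a guarded input $b \to c?x$. Let

$(s; \sigma) \xrightarrow{\theta} (l'; \sigma')$ and $(l'; \sigma') \xrightarrow{(c, v)} (l; \sigma'\{v/x\})$,

with $\sigma' \in b$ and $\sigma \in \mathcal{Q}_s$. By the induction hypothesis we have that
$\sigma'\{\sigma(h) \circ \theta/h\} \in \mathcal{Q}_{l'}$. We are given that $\mathcal{Q}$ is compositionally inductive, so we have that

$g(\exists x.\mathcal{Q}_{l'} \cap b) \subseteq \mathcal{Q}_l$,

where $g(\sigma) = \sigma\{\sigma(h) \cdot (c, \sigma(x))/h\}$, for every state $\sigma$. Now $\sigma'\{\sigma(h) \circ \theta/h\} \in \mathcal{Q}_{l'} \cap b$
(note that $h \notin VAR(b)$), so $\sigma'\{v/x\}\{\sigma(h) \circ \theta/h\} \in \exists x.\mathcal{Q}_{l'} \cap b$. Consequently,
$g(\sigma'\{v/x\}\{\sigma(h) \circ \theta/h\}) = \sigma'\{v/x\}\{\sigma(h) \circ \theta \cdot (c, v)/h\} \in g(\exists x.\mathcal{Q}_{l'} \cap b) \subseteq \mathcal{Q}_l$.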

6 Semantic Completeness 

We want to prove completeness, that is, we want to prove that every valid partial
correctness statement of a system $P$ is derivable, i.e.,

Theorem 2. $\models \{\phi\}P\{\psi\} \Rightarrow \vdash \{\phi\}P\{\psi\}$.

We prove this by induction on the structure of $P$, restricting ourselves to the
cases of basic transition diagrams and parallel composition. To this end we
introduce the following strongest postcondition semantics.

Definition 17. Given a basic synchronous transition diagram $P = (L, T, s, t)$,
$l \in L$, and a precondition $\phi$ we define $SP_l(\phi, P) =$

$\{\sigma\{\theta'/h\} \mid \text{there exists } \sigma' \text{ s.t. } \sigma' \in \phi,\ (\sigma', \sigma, \theta) \in O_l(P), \text{ and } \theta' = \sigma'(h) \circ \theta\}$.

By $SP(\phi, P)$, for $P = (L, T, s, t)$, we denote $SP_t(\phi, P)$. Similarly we define
$SP(\phi, P)$, for $P$ a concurrent system, in terms of $O(P)$.

It is easy to see that $\models \{\phi\}P\{SP(\phi, P)\}$, and that $SP_t(\phi, P) \subseteq \psi$ is implied
by the validity of $\{\phi\}P\{\psi\}$, for $P$ a synchronous transition diagram. Similarly,
for $P$ a concurrent system, we have that $\models \{\phi\}P\{SP(\phi, P)\}$ and that the validity
of $\{\phi\}P\{\psi\}$ implies $SP(\phi, P) \subseteq \psi$. We now prove our completeness result for
basic synchronous transition diagrams, i.e.,
for $P$ a basic synchronous transition diagram, we have $\vdash \{\phi\}P\{SP(\phi, P)\}$.

Proof 

Let $P = (L, T, s, t)$ and let $\mathcal{Q}$ be the assertion network which associates with each
location $l$ of $P$ the predicate $SP_l(\phi, P)$. It is easy to prove that this assertion
network is compositionally inductive. We treat the case of an input transition
$l \xrightarrow{a} l' \in T$, with $a = b \to c?x$. We have to prove that

$g(\exists x.(SP_l(\phi, P) \cap b)) \subseteq SP_{l'}(\phi, P)$, where $g(\sigma) = \sigma\{\sigma(h) \cdot (c, \sigma(x))/h\}$, for every
state $\sigma$. So let $\sigma \in g(\exists x.SP_l(\phi, P) \cap b)$, then $\sigma = \sigma'\{v/x\}\{\sigma'(h) \cdot (c, v)/h\}$,
where $\sigma' \in SP_l(\phi, P) \cap b$, for some value $v$. From $\sigma' \in SP_l(\phi, P) \cap b$ we derive
the existence of a computation $(s; \sigma_0) \xrightarrow{\theta} (l; \sigma'')$ where $\sigma_0 \in \phi$ and
$\sigma' = \sigma''\{\sigma_0(h) \circ \theta/h\}$. By Def. 5 it follows that $(l; \sigma'') \xrightarrow{(c, v)} (l'; \sigma''\{v/x\})$. We conclude
that $\sigma = \sigma'\{v/x\}\{\sigma'(h) \cdot (c, v)/h\} \in SP_{l'}(\phi, P)$.

Coming back to our main argument, we derive by Rule 12 that

$\vdash \{SP_s(\phi, P)\}P\{SP_t(\phi, P)\}$.

Next we observe that $\phi \subseteq SP_s(\phi, P)$, so an application of the consequence rule
gives us $\vdash \{\phi\}P\{SP(\phi, P)\}$.



Now we prove

$\vdash \{\phi\}P\{SP(\phi, P)\}$, for $P = P_1 \parallel P_2$,

by induction on the complexity of $P$. We first introduce the following sets of
variables and channels.

- $\bar{x} = VAR(\phi, P)$,

- $\bar{x}_i = VAR(P_i)$, $i = 1, 2$,

- $C_i = CHAN \setminus (CHAN(P_j) \setminus CHAN(P_i))$, $i \neq j$.

By the induction hypothesis we have $\vdash \{\phi'\}P_i\{SP(\phi', P_i)\}$, $i = 1, 2$,
where $\phi' = (\phi \cap \bar{z} = \bar{x} \cap t = h)$, with $t$ a 'fresh' history variable and $\bar{z}$ a sequence
of 'fresh' variables corresponding to $\bar{x}$ ($\bar{z} = \bar{x}$ denotes the set of states $\sigma$ such
that $\sigma(z_i) = \sigma(x_i)$, for every $z_i \in \bar{z}$ and corresponding $x_i$). The variables $t$ and
$\bar{z}$ are used to freeze the initial values of $\bar{x}$ and $h$; the predicate $\bar{z} = \bar{x}$ initializes
the variables of $\bar{z}$ to the values of the corresponding $\bar{x}$. Similarly, $t = h$ sets the
value of $t$ to the initial value of $h$.

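Applying the consequence rule we obtain

$\vdash \{\phi'\}P_i\{\exists \bar{x}_j(SP(\phi', P_i) \downarrow C_i)\}$, $i, j \in \{1, 2\}$, $i \neq j$,

using that $SP(\phi', P_i) \subseteq \exists \bar{x}_j(SP(\phi', P_i) \downarrow C_i)$. In the sequel we write
$\exists \bar{x}_j SP(\phi', P_i) \downarrow C_i$ (dropping the parentheses). This is justified because
$\exists \bar{x}_j(SP(\phi', P_i) \downarrow C_i) = (\exists \bar{x}_j SP(\phi', P_i)) \downarrow C_i$. Clearly the conditions of the parallel composition rule are
satisfied. So we obtain $\vdash \{\phi'\}P\{\exists \bar{x}_2 SP(\phi', P_1) \downarrow C_1 \cap \exists \bar{x}_1 SP(\phi', P_2) \downarrow C_2\}$.

Next we apply the conjunction rule to the above and the prefix invariance axiom,
obtaining $\vdash \{\phi' \cap t = h\}P\{\exists \bar{x}_2 SP(\phi', P_1) \downarrow C_1 \cap \exists \bar{x}_1 SP(\phi', P_2) \downarrow C_2 \cap t \le h\}$.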
(Note that actually $t = h$ is already implied by $\phi'$.) To proceed we need the
following main lemma claiming that this postcondition indeed characterizes the
strongest postcondition (for the technical details of its proof see [7]).

Lemma 1. $(\exists \bar{x}_2 SP(\phi', P_1) \downarrow C_1 \cap \exists \bar{x}_1 SP(\phi', P_2) \downarrow C_2 \cap t \le h) \subseteq SP(\phi', P)$.

Applying the consequence rule we thus obtain $\vdash \{\phi'\}P\{SP(\phi', P)\}$.

Since $SP(\phi', P) \subseteq SP(\phi, P)$ ($\phi' \subseteq \phi$ and $SP$ is monotone in its first argument),
we derive by another application of the consequence rule

$\vdash \{\phi'\}P\{SP(\phi, P)\}$.

An application of the elimination rule then gives us $\vdash \{\exists \bar{z}, t.\phi'\}P\{SP(\phi, P)\}$.
Finally, we observe that $\phi = \exists \bar{z}, t.\phi'$, from which the result follows.

7 Future Work 

The present paper is the second in a series of papers on the semantic foundations 
of compositional reasoning. The first paper in this series [2] discusses the gen- 
eral philosophy behind this approach, and explains the semantic foundations of 
compositional reasoning about top-level synchronous networks. The present one 
extends this theory to nested concurrency, in which synchronous networks can 
be composed both sequentially and in parallel. This extension requires the
introduction in the proof method of the prefix-invariance axiom, a purely semantic
account of which forms one of the main challenges dealt with in this paper. The
third paper will discuss the semantic foundations of rely-guarantee-based
reasoning for shared-variable concurrency. Taken together, these papers constitute
the basis for a chapter on compositional reasoning in [7].
Abstract. We investigate the difference between two well-known
notions of independence bisimilarity, history-preserving bisimulation and
hereditary history-preserving bisimulation. We characterise the difference
between the two bisimulations in trace-theoretical terms, advocating the
view that the first is (just) a bisimulation for causality, while the second
is a bisimulation for concurrency. We explore the frontier zone between
the two notions by defining a hierarchy of bounded backtracking
bisimulations. Our goal is to provide a stepping stone for the solution to the
intriguing open problem of whether hereditary history-preserving
bisimulation is decidable or not. We prove that each of the bounded
bisimulations is decidable. However, we also prove that the hierarchy is strict.
This rules out the possibility that decidability of the general problem
follows directly from the special case. Finally, we give a non-trivial
reduction solving the general problem for a restricted class of systems and
give pointers towards a full answer.



1 Introduction 

The classical bisimulation of Park and Milner applies to a setting in which
concurrency is reduced to non-deterministic interleaving of actions. However, for
some situations, a more detailed description of the causal ordering between
actions is needed [17, 14], which calls for models that do not abstract from
concurrency, commonly referred to as independence, partial order or true concurrency
models.

Many attempts have been made to answer the question of what the appropriate
generalisation of the interleaving bisimulation to independence models is. Two
interesting bisimulations for independence models are history-preserving
bisimulation (HPB) and hereditary history-preserving bisimulation (HHPB). HPB was
introduced in [13] and [5] under the names of behaviour structure bisimulation
and mixed ordering (mo) bisimulation, respectively. The term history-preserving
originates from [14], where Goltz and van Glabbeek define the notion for event
structures and prove the key property of HPB, namely that it is preserved
under action refinement. This result has given HPB its prominent place among
independence bisimulations. In [2] the notion is introduced as fully concurrent
bisimulation. There it is independently shown that HPB preserves action
refinement for the more general model of Petri nets.

The notion of HHPB first appears in [1], where Bednarczyk studies several
history-preserving bisimulations with a downwards closure condition. He calls
sets that satisfy this condition hereditary. The paper [9] describes a uniform way
of defining an abstract bisimulation equivalence across a wide range of different
models by applying category-theoretical ideas. For many concrete models, the
abstract bisimulation specializes to already known equivalences [4]. In particular,
one gets classical bisimulation for standard transition systems. For independence
models, the abstract notion of bisimulation specializes to HHPB (which is called
strong HPB in [9]), suggesting that HHPB is a very natural independence
bisimulation. This is further confirmed by the results of [12].

Altogether a fair amount of work has been done already in studying both
HPB and HHPB. However, only a few attempts [15] have been made to
demarcate the two notions from each other. Moreover, an intriguing question remains
unsolved: Is HHPB decidable for a reasonable class of systems? In contrast,
HPB has been shown to be decidable for finite 1-safe Petri nets by Vogler [16],
DEXPTIME-complete by Jategaonkar and Meyer [8] and decidable for n-safe
nets by Montanari and Pistore [10]. But there is no straightforward adaptation of
these proofs to HHPB, and it seems that the hereditary condition brings about
new dimensions. This justifies a deeper investigation of the difference between
plain and hereditary HPB, which is the goal of this paper.

One statement we want to put forward is that hereditary HPB is a
bisimulation for concurrency as opposed to plain HPB (just) being a bisimulation for
causality. Intuitively, HPB is an equivalence notion that relates systems with




Fig. 1. The nets N and N' are HP bisimilar but not HHP bisimilar.



the same causal branching structure. It extends the classical notion of
bisimulation with the requirement that any two related runs must have the same
causal dependency between actions. HHPB additionally imposes a backtracking
condition: for any two related runs, the runs obtained by backtracking a pair of
related transitions must be related, too. We allow backtracking not only in the
order which is laid down by the related runs; as long as no other transitions
depend on a particular transition, it can be backtracked. Thereby it is ensured
that the matching is not dependent on the order in which independent actions
are linearized.

Fig. 1 shows the standard example from [12] of two systems that are plain but
not hereditary HP bisimilar. The transitions are labelled by the actions $\{a, b, c, d\}$
as the names suggest, e.g. $a_1$ is labelled by $a$. In any HPB, the matching of the
parallel $a$- and $b$-actions depends on the order in which they appear in the runs
to match. Note that the $c$ transition dictates that we have to match $a_1$ to $a'_1$,
and so $a_1.b_1$ to $a'_1.b'_1$. Then the backtracking condition requires that $b_1$ and $b'_1$
are related. But from this point, the system $N'$ can make a $d$ transition which
$N$ cannot match, so the two systems are not HHP bisimilar.

After stating the necessary definitions in Sec. 2, we present a trace-theoretical
characterisation of the difference between HHPB and HPB in Sec. 3. This will
confirm our view of HHPB as a bisimulation for concurrency as opposed to HPB
as a bisimulation for causality. In Sec. 4, we consider the effect of restricting
HHPB, by bounding how far back in two related runs one can pick transitions
to backtrack. Remarkably, we prove in Sec. 4.1 that for a fixed bound, each such
bisimulation is decidable. However, in Sec. 4.2 we find that the bounded
bisimulations form a strict hierarchy, all trivially stronger than HPB but also strictly
weaker than HHPB. In Sec. 5 we apply our results to approach the decidability
of HHPB (for finite-state systems). After noting that decidability follows almost
immediately for the class of bounded asynchronous nets, we present a non-trivial
reduction showing that HHPB is decidable for systems with transitive
independence relation. In the end, we remark on other partial results and give directions
for further progress.

Let us note that one can also consider hidden actions in the context of HPB
and HHPB. The weak version of HPB has been proved to be decidable in [8]
and [18]. Here we will restrict our attention to (hereditary) HPB without hidden
actions. As our model of computation we choose 1-safe Petri nets. However,
e.g. by using the results of [19], our results can equally be formulated for other
suitable independence models. Some proofs are left out due to the limitations of
an extended abstract; for more details see [7].

2 Preliminaries 

The following definitions are standard and/or can be found in [8], [11], or [16],
perhaps in a slightly varied form.

Petri nets. A labelled Petri net $N$ is a tuple $(S_N, T_N, F_N, init_N, l_N)$, where $S_N$
is the set of places, $T_N$ is the set of transitions, $F_N : (S_N \times T_N) \cup (T_N \times S_N) \to \{0, 1\}$
is the flow relation, $init_N : S_N \to \mathbb{N}_0$ is the initial marking, and $l_N : T_N \to Act$
is the labelling function, where $Act$ is a set of actions. A net $N$ is finite iff
$S_N$ and $T_N$ are finite sets. The pre-set of an element $x \in S_N \cup T_N$, ${}^\bullet x$, is defined
by $\{y \mid F_N(y, x) > 0\}$; the post-set of $x$, $x^\bullet$, similarly is $\{y \mid F_N(x, y) > 0\}$. A
marking $M$ of $N$ is a map $S_N \to \mathbb{N}_0$. We say $M$ enables a transition $t \in T_N$ if
$M(s) \ge F(s, t)$ for every $s \in S_N$. If $t$ is enabled at $M$ it can occur. The resulting
marking $M'$ is defined by $M'(s) = M(s) - F(s, t) + F(t, s)$ for all $s \in S_N$. We
denote this by $M \xrightarrow{t} M'$. We say that $w = t_1 \ldots t_n$ is a transition-sequence of
$N$. We write $|w|$ for the length of $w$, that is $|w| = n$. If $M \xrightarrow{t_1} \cdots \xrightarrow{t_n} M'$ we use
$M \xrightarrow{w} M'$ as short notation. For any transition $t$ we write $w.t$ for the sequence
$t_1 \ldots t_n t$. A net $N$ is 1-safe if for every marking $M$ that is reachable from $init_N$,
we have $M(s) \le 1$ for every $s \in S_N$. We will always refer to this net class
whenever we speak of 'nets' or 'Petri nets' in the following.
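The enabling condition and the firing rule above can be sketched directly. This is a minimal illustration under assumptions of our own: a marking is a dictionary from places to token counts, the flow relation is a dictionary from (source, target) pairs to 0 or 1 with missing pairs read as 0, and the names `enabled`, `fire`, `marking_after`, `FLOW` and `INIT` are hypothetical.

```python
# Sketch only: firing rule of a labelled Petri net. The representation and
# all names are assumptions made for illustration.

def enabled(marking, flow, t):
    """M enables t iff M(s) >= F(s, t) for every place s."""
    return all(marking[s] >= flow.get((s, t), 0) for s in marking)


def fire(marking, flow, t):
    """Return M' with M'(s) = M(s) - F(s, t) + F(t, s); t must be enabled."""
    if not enabled(marking, flow, t):
        raise ValueError(f"transition {t!r} is not enabled")
    return {s: marking[s] - flow.get((s, t), 0) + flow.get((t, s), 0)
            for s in marking}


def marking_after(init, flow, run):
    """Marking reached from init by firing the transition-sequence run."""
    marking = dict(init)
    for t in run:
        marking = fire(marking, flow, t)
    return marking


# A two-step net: p1 -> a -> p2 -> b -> p3.
FLOW = {("p1", "a"): 1, ("a", "p2"): 1, ("p2", "b"): 1, ("b", "p3"): 1}
INIT = {"p1": 1, "p2": 0, "p3": 0}

after_a = fire(INIT, FLOW, "a")
after_ab = marking_after(INIT, FLOW, ["a", "b"])
```

In this example `b` is not enabled at `INIT`, and firing `a` and then `b` moves the single token from `p1` to `p3`, so every reachable marking respects the 1-safe bound.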

Runs. A run of a net $N$ is a possibly empty transition-sequence $r$ such that
$init_N \xrightarrow{r} M'$ for some $M'$. Let $Runs(N)$ denote the set of all runs of a net $N$.
When we have $r \in Runs(N)$, $t \in T_N$, and two markings $M$, $M'$, such that
$init_N \xrightarrow{r} M$ and $M \xrightarrow{t} M'$, then we write $r \xrightarrow{t} r.t$.

Independence of Transitions. We say two transitions $t$ and $t'$ of a net $N$ are
independent in $N$, denoted by $t \mathrel{I_N} t'$, iff their neighbourhoods of places do not
intersect, i.e. iff $({}^\bullet t \cup t^\bullet) \cap ({}^\bullet t' \cup t'^\bullet) = \emptyset$.
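Independence is a purely structural check on pre-sets and post-sets. The sketch below assumes the flow relation is given as a dictionary from (source, target) pairs to 0 or 1; the names `neighbourhood`, `independent` and `FLOW` are hypothetical.

```python
# Sketch only: structural independence of two transitions. The encoding of
# the flow relation and all names are assumptions made for illustration.

def neighbourhood(t, flow):
    """Union of the pre-set and the post-set of transition t."""
    pre = {x for (x, y), weight in flow.items() if y == t and weight > 0}
    post = {y for (x, y), weight in flow.items() if x == t and weight > 0}
    return pre | post


def independent(t1, t2, flow):
    """t1 and t2 are independent iff their neighbourhoods are disjoint."""
    return not (neighbourhood(t1, flow) & neighbourhood(t2, flow))


# Transitions a and b run in parallel; c consumes the outputs of both.
FLOW = {
    ("p1", "a"): 1, ("a", "p2"): 1,
    ("p3", "b"): 1, ("b", "p4"): 1,
    ("p2", "c"): 1, ("p4", "c"): 1, ("c", "p5"): 1,
}

a_b = independent("a", "b", FLOW)
a_c = independent("a", "c", FLOW)
```

Here `a` and `b` are independent, while `a` and `c` share the place `p2` and are therefore dependent. A transition with a non-empty neighbourhood is never independent of itself, which matches the irreflexivity required of independence relations.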

Pomsets. A pomset is a labelled partial order.¹ It is a tuple $p = (E_p, \le_p, L_p, l_p)$,
where $E_p$ is a set of events, $\le_p$ a partial order relation on $E_p$, $L_p$ is a set of
labels, and $l_p$ a labelling function $l_p : E_p \to L_p$. A function $f$ is an isomorphism
between pomset $p$ and pomset $q$ iff $f : E_p \to E_q$ is a bijection, such that we have
$l_p = l_q \circ f$, and $e \le_p e'$ iff $f(e) \le_q f(e')$ for all $e, e' \in E_p$.

Transition-pomsets. The transition-pomset of a run $r = t_1 \ldots t_n$, denoted by
$trPom(r)$, has as events the integers from 1 to $n$, where the label of event $i$ is
$t_i$ and the partial ordering is the transitive closure of the following "proximate
cause" relation: event $i$ proximately causes event $j$ iff $i < j$ and $t_i$ and $t_j$ are not
independent in $N$. The pomset of $r$, denoted by $pom(r)$, is the transition-pomset
of $r$, where the label of each event $i$ is $l_N(t_i)$, the label of $t_i$, rather than $t_i$ itself.
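The ordering of a transition-pomset can be computed from a run and an independence test. The following sketch returns the strict part of the order as a set of pairs of 1-based event indices; the independence relation is passed in as a function, and the names `pomset_order`, `INDEPENDENT_PAIRS` and `independent` are hypothetical.

```python
# Sketch only: strict ordering of trPom(r), obtained as the transitive
# closure of the proximate-cause relation. Names are hypothetical.

def pomset_order(run, independent):
    """Pairs (i, j) of 1-based events such that i causally precedes j."""
    n = len(run)
    # Proximate cause: i < j and t_i, t_j are not independent.
    order = {(i, j)
             for i in range(1, n + 1)
             for j in range(i + 1, n + 1)
             if not independent(run[i - 1], run[j - 1])}
    # Transitive closure (Warshall's algorithm).
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if (i, k) in order and (k, j) in order:
                    order.add((i, j))
    return order


# Assumed independence relation: only a and b are independent.
INDEPENDENT_PAIRS = {frozenset(("a", "b"))}


def independent(t1, t2):
    return frozenset((t1, t2)) in INDEPENDENT_PAIRS


order_abc = pomset_order(["a", "b", "c"], independent)
order_acb = pomset_order(["a", "c", "b"], independent)
```

For the run `a b c` the events 1 and 2 are unordered and both precede event 3. For the run `a c b` event 1 precedes event 3 only through the closure step, since `a` and `b` are independent but both depend on `c`.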

Trace Theory. A trace alphabet is a pair $(\Sigma, I)$, where the alphabet $\Sigma$ is a finite
set, and $I \subseteq \Sigma \times \Sigma$ is an irreflexive and symmetric independence relation. Let
$\Sigma^*$ be the set of finite words over $\Sigma$, and let $r, r'$ range over $\Sigma^*$. For $T \subseteq \Sigma$, let
$r{\upharpoonright}T$ denote the projection of $r$ onto $T$, i.e. the sequence obtained by erasing all
occurrences of letters which are not in $T$. The independence relation $I$ induces a
relation ${\sim_I} \subseteq \Sigma^* \times \Sigma^*$ defined by $r \sim_I r'$ iff $r{\upharpoonright}\{a, b\} = r'{\upharpoonright}\{a, b\}$ for all $a, b \in \Sigma$
such that $\neg(a \mathrel{I} b)$. Clearly, $\sim_I$ is an equivalence relation. The equivalence
classes are usually referred to as (Mazurkiewicz's) traces. For $r \in \Sigma^*$, $[r]$ stands
for the trace containing $r$. $\Sigma^*/{\sim_I}$ represents the set of all traces over $(\Sigma, I)$.
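The projection-based definition of trace equivalence translates into a short check. In this sketch words are Python sequences of letters and the independence relation is passed in as a function; the names `project`, `trace_equivalent`, `ALPHABET` and `independent` are hypothetical.

```python
# Sketch only: trace equivalence of two words via projections onto pairs of
# dependent letters. Names are hypothetical.

def project(word, letters):
    """Erase all occurrences of letters that are not in the given set."""
    return [x for x in word if x in letters]


def trace_equivalent(r1, r2, alphabet, independent):
    """r1 ~ r2 iff their projections agree on every pair of dependent letters."""
    return all(project(r1, {a, b}) == project(r2, {a, b})
               for a in alphabet for b in alphabet
               if not independent(a, b))


ALPHABET = {"a", "b", "c"}
INDEPENDENT_PAIRS = {frozenset(("a", "b"))}  # assumed: only a and b commute


def independent(x, y):
    return frozenset((x, y)) in INDEPENDENT_PAIRS


swap_ab = trace_equivalent("abc", "bac", ALPHABET, independent)
swap_bc = trace_equivalent("abc", "acb", ALPHABET, independent)
extra_b = trace_equivalent("ab", "abb", ALPHABET, independent)
```

Because the relation is irreflexive, every letter is dependent on itself, so the pair (a, a) forces both words to contain the same number of occurrences of a. This is why `extra_b` is false, while swapping the independent letters `a` and `b` preserves equivalence.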

Petri nets and Trace Theory. We can associate the trace alphabet $(\Sigma_N, I_N)$
to a Petri net $N$, where $\Sigma_N = T_N$, and $I_N$ is as defined above. Transition-pomsets
of a net $N$ correspond one-to-one to traces in $Runs(N)/{\sim_{I_N}} \subseteq \Sigma_N^*/{\sim_{I_N}}$. A
transition-pomset $p$ of $N$ corresponds to the trace $\{r \mid r \text{ is a linearization of } p\}$,
and a trace $tr \in Runs(N)/{\sim_{I_N}}$ corresponds to $trPom(r)$, where $r$ is any
representative of $tr$.



¹ This is not the original definition, but the convention used in [8].
3 (Hereditary) History-Preserving Bisimulation and
Trace Theory

We are now ready for the two notions which are central to this paper, HPB 
and HHPB. Originally, these bisimulations have been defined on structures that 
represent the partial order explicitly. By employing the notion of synchronous
runs from [8], and the notion of backwards enabled transitions introduced in [12],
we can define (hereditary) HPB on runs, instead. This gives a characterization
closely related to work in [5] and [12].

Definition 1. Let $r_1$ and $r_2$ be runs of nets $N_1$ and $N_2$, respectively. We say
that $r_1$ and $r_2$ are synchronous iff the identity function on $\{1, 2, \ldots, |r_1|\}$ is an
isomorphism between the pomset of $r_1$ and the pomset of $r_2$.

Definition 2. Let $N$ be a net, and $(\Sigma_N, I_N)$ the associated trace alphabet. Let
$r = t_1 \ldots t_n \in Runs(N)$. For $t \in \Sigma_N$, we say $t$ is backwards enabled in $r$, written
$t \in BEn(r)$, iff there is $i \in \{1, \ldots, n\}$ s.t. $t_i = t$, and $\forall j \in \{i + 1, \ldots, n\}.\ t_j \mathrel{I_N} t_i$.
This means that $i$ is a maximal element in $pom(r)$. If $t \in BEn(r)$ we define
$\delta(r, t)$ to be the result of deleting the last occurrence of $t$ in $r$, i.e. $\delta(r, t) =
t_1 \ldots t_{i-1} t_{i+1} \ldots t_n$ iff $last(r, t) = i$, where $last(r, t)$ denotes the position of the
last occurrence of $t$ in $r$. That is, $last(r, t) = i$ iff $t_i = t$ and $t_j \neq t$ for all
$j \in \{i + 1, \ldots, n\}$.
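Definition 2 can be sketched on runs represented as Python lists. Since the independence relation is irreflexive, the position i in the definition is necessarily the last occurrence of t, which the code below exploits. The independence relation is passed in as a function, and the names `last`, `backwards_enabled`, `delete_last` and `independent` are hypothetical.

```python
# Sketch only: backwards enabled transitions and deletion of the last
# occurrence of a transition in a run. Names are hypothetical.

def last(run, t):
    """1-based position of the last occurrence of t in run, or None."""
    for i in range(len(run), 0, -1):
        if run[i - 1] == t:
            return i
    return None


def backwards_enabled(run, t, independent):
    """t is backwards enabled iff every later transition is independent of it."""
    i = last(run, t)
    if i is None:
        return False
    return all(independent(run[j - 1], t) for j in range(i + 1, len(run) + 1))


def delete_last(run, t):
    """The run obtained by deleting the last occurrence of t."""
    i = last(run, t)
    if i is None:
        raise ValueError(f"{t!r} does not occur in the run")
    return run[:i - 1] + run[i:]


INDEPENDENT_PAIRS = {frozenset(("a", "b"))}  # assumed: only a and b commute


def independent(x, y):
    return frozenset((x, y)) in INDEPENDENT_PAIRS


run = ["a", "b", "c"]
be_c = backwards_enabled(run, "c", independent)
be_b = backwards_enabled(run, "b", independent)
be_a_short = backwards_enabled(["a", "b"], "a", independent)
without_a = delete_last(["a", "b"], "a")
```

In the run `a b c` only `c` can be backtracked, because `c` depends on both `a` and `b`. In the shorter run `a b` the transition `a` is backwards enabled even though it is not the last transition, and deleting it leaves the run `b`.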

Definition 3. A HPB between two nets $N_1$ and $N_2$ consists of a set $\mathcal{H} \subseteq
Runs(N_1) \times Runs(N_2)$ of pairs $(r_1, r_2)$ such that

(i) Whenever $(r_1, r_2) \in \mathcal{H}$, then $r_1$ and $r_2$ are synchronous.

(ii) $(\epsilon, \epsilon) \in \mathcal{H}$.

(iii) Whenever $(r_1, r_2) \in \mathcal{H}$ and $r_1 \xrightarrow{t_1} r_1.t_1$ for some $t_1$, then there exists $t_2$, such
that $r_2 \xrightarrow{t_2} r_2.t_2$ and $(r_1.t_1, r_2.t_2) \in \mathcal{H}$.

(iv) Vice versa.

A HPB is hereditary when it further satisfies

(v) Whenever $(r_1, r_2) \in \mathcal{H}$ and $t_1 \in BEn(r_1)$ and $t_2 \in BEn(r_2)$ for some $t_1$, $t_2$
such that $last(r_1, t_1) = last(r_2, t_2)$, then $(\delta(r_1, t_1), \delta(r_2, t_2)) \in \mathcal{H}$.

We say two nets are (hereditary) HP bisimilar iff there is a (hereditary) HPB
relating them.

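It is trivial that one can regard a relation $R \subseteq \{(r_1, r_2) \in T_{N_1}^* \times T_{N_2}^* \mid
|r_1| = |r_2|\}$ as a language over the alphabet $T_{N_1} \times T_{N_2}$, and vice versa. With
this in mind, we can regard a (hereditary) HPB $\mathcal{H}$ as a language over the trace
alphabet $\mathcal{T}_{N_1,N_2}$. We define $\mathcal{T}_{N_1,N_2}$ as $\mathcal{T}_{N_1,N_2} = (\Sigma, I)$, where $\Sigma = T_{N_1} \times T_{N_2}$,
and $I$ is defined as $(t_1, t_2) \mathrel{I} (t'_1, t'_2) \iff t_1 \mathrel{I_{N_1}} t'_1 \wedge t_2 \mathrel{I_{N_2}} t'_2$.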

We will now characterize the difference between HPB and HHPB in trace- 
theoretical terms. For this we consider two properties of languages. 
Definition 4. We say a language $L \subseteq \Sigma^*$ is prefix-closed iff $r.t \in L$ implies
$r \in L$.

We say $L$ is trace-consistent w.r.t. an independence relation $I$ on $\Sigma$ iff
$r \sim_I r' \in L$ implies $r \in L$. For $L \subseteq \Sigma^*$, let $[L]_{\sim_I}$ denote the smallest trace
language including $L$, i.e. $[L]_{\sim_I} = \{r \in \Sigma^* \mid \exists r' \in L.\ r' \sim_I r\}$.
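For finite languages both properties of Definition 4 can be checked mechanically. The sketch below relies on the standard fact that the trace of a word is exactly the set of words reachable by repeatedly swapping adjacent independent letters. Words are tuples of letters, languages are sets of tuples, and the names `trace_class`, `is_prefix_closed`, `is_trace_consistent` and `trace_closure` are hypothetical.

```python
# Sketch only: prefix-closure, trace-consistency and trace closure for
# finite languages of words (tuples of letters). Names are hypothetical.

def trace_class(word, independent):
    """All words trace-equivalent to word, via adjacent independent swaps."""
    start = tuple(word)
    seen = {start}
    stack = [start]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if independent(w[i], w[i + 1]):
                v = w[:i] + (w[i + 1], w[i]) + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def is_prefix_closed(language):
    """r.t in L implies r in L."""
    return all(w[:-1] in language for w in language if w)


def is_trace_consistent(language, independent):
    """Every word equivalent to a member of L is itself a member of L."""
    return all(trace_class(w, independent) <= language for w in language)


def trace_closure(language, independent):
    """Smallest trace language including L."""
    closure = set()
    for w in language:
        closure |= trace_class(w, independent)
    return closure


INDEPENDENT_PAIRS = {frozenset(("a", "b"))}  # assumed: only a and b commute


def independent(x, y):
    return frozenset((x, y)) in INDEPENDENT_PAIRS


L = {(), ("a",), ("a", "b")}
closed_L = trace_closure(L, independent)
```

The language `L` is prefix-closed but not trace-consistent, because the equivalent word `b a` is missing. Its trace closure adds `b a` but not the prefix `b`, so in this example the closure of a prefix-closed language is no longer prefix-closed.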

By definition every HHPB is prefix-closed. This does not generally apply for
HPBs. But as prefix-closed HPBs correspond to bisimulations that have been
built up inductively from $(\epsilon, \epsilon)$ without adding "any redundant tuples", we can
extract from any given HPB one that is prefix-closed.

Proposition 1. Two nets are (hereditary) HP bisimilar iff there exists a prefix- 
closed (hereditary) HPB language relating them. 

A HPB language H is not necessarily trace-consistent, neither is a HHPB. 
But this can always be obtained. 

Observation 1. Let $\mathcal{H}$ be a (hereditary) HPB language between two nets $N_1$
and $N_2$. Let $\mathcal{T}_{N_1,N_2} = (\Sigma, I)$. Then $[\mathcal{H}]_{\sim_I}$ is a (hereditary) HPB too.

Prop. 1 ensures that it is safe to consider only prefix-closed HPBs. Note that
if this property is fixed, an analogue to Obs. 1 is no longer possible. In general,
if $\mathcal{H}$ is a prefix-closed HPB, $[\mathcal{H}]_{\sim_I}$ is not necessarily prefix-closed. However, if $\mathcal{H}$
is hereditary, this will still be true.

Interestingly, if a prefix-closed HPB is also trace-consistent, it is in fact hered- 
itary. So, if one takes as part of the definition that a HPB is prefix-closed, one 
can regard hereditary HPBs as the class of trace-consistent HPBs. 

Proposition 2. Two nets are hereditary HP bisimilar iff there exists a trace- 
consistent prefix-closed HPB relating them. 

Remark: Conversely, from Obs. 1 it follows that one could take as part of
the definition that a HPB is trace-consistent. Then HHPBs become the class
of prefix-closed HPBs (which corresponds to the approach taken in the original
definition). We find the view we have put forward more natural. Taking
trace-consistency as part of the definition disguises how the linearized runs of the two
systems are matched to each other. Since in HPBs the matching can be
dependent on the order in which independent actions are linearized, this is information
we do not want to hide away in a HPB. In contrast, the interpretation of HHPBs
as the class of (prefix-closed) HPBs that are trace languages expresses nicely that
in HHPB the matching does not depend on the linearizations.

4 History-Preserving Bisimulation and Bounded 
Backtracking 

We define a hierarchy of backtracking bisimulations by bounding the number of 
transitions which one can backtrack over to an arbitrary number n. 
Definition 5. A HPB $\mathcal{H}$ is (n)-hereditary when it further satisfies

(v) Whenever $(r_1, r_2) \in \mathcal{H}$ and $t_1 \in BEn(r_1)$ and $t_2 \in BEn(r_2)$ for some $t_1$, $t_2$
such that $last(r_1, t_1) = last(r_2, t_2) \ge |r_1| - n$, then $(\delta(r_1, t_1), \delta(r_2, t_2)) \in \mathcal{H}$.

Note that (0)-hereditary HPBs are exactly the prefix-closed HPBs.

It is easy to give a dynamic condition on nets, which guarantees that (n)- 
hereditary HP bisimilarity coincides with hereditary HP bisimilarity. 

Definition 6. Let $N$ be a net. We say that $N$ is (n)-bounded asynchronous if
for any $r = t_1 t_2 \ldots t_k \in Runs(N)$ such that $t_i \in BEn(r)$, it holds that $k - i < n$.

Proposition 3. Let N and N' be two (n)-bounded asynchronous nets. Then N 
and N' are hereditary HP bisimilar iff N and N' are (n) -hereditary HP bisimilar. 



4.1 Decidability of (n)-Hereditary History-Preserving Bisimilarity 

For any fixed n, (n)-HHP bisimilarity is decidable for finite systems. The idea 
behind our proof is that we can define HHPB and (n)-HHPB in a ‘forward fash- 
ion’. At each tuple we keep a matching directive that prescribes how transitions 
are going to be matched from this point onwards. The matching directive al- 
lows us to express the backtracking requirement as a property of the matching 
directives of two connected tuples. 

To characterize HHPB in this manner we need to record the matching of
the entire future. Because of this the forwards characterization merely shifts the
difficulty of the decidability of HHPB from the past to the future: now we are
confronted with an infinite number of possible futures. This is not the case for
(n)-HHPB: we shall see that it is sufficient to record future matchings of
length n. Our proof builds on this fact and insights gained in the proofs of the
decidability of HPB [16, 8].

Below is the definition of (n)-D HPB, our forwards characterization of (n)- 
HHPB. 

Convention. For a pair of synchronous runs $(r_1, r_2)$ of two nets $N_1$ and $N_2$, we use $r$
as a short notation. Similarly, we write $t$ for a pair of transitions $(t_1, t_2)$ when $t_1$ and
$t_2$ correspond to each other in a pair of synchronous runs $(r_1, r_2)$. We also write $r \xrightarrow{t} r'$
when we have two pairs of synchronous runs $(r_1, r_2)$, $(r'_1, r'_2)$, and a pair of transitions
$(t_1, t_2)$, such that $r_1 \xrightarrow{t_1} r'_1$ and $r_2 \xrightarrow{t_2} r'_2$.

Definition 7. A (n)-D HPB between two nets $N_1$ and $N_2$ consists of a set $\mathcal{H}_D$
of triples $(r_1, r_2, D)$ such that

(i) $r_1$ is a run of $N_1$, $r_2$ is a run of $N_2$, and $r_1$ and $r_2$ are synchronous. The
matching directive $D$ is a non-empty and prefix-closed set of pairs of words
$(w_1, w_2)$, such that $w_1$ is a transition-sequence of $N_1$, $w_2$ of $N_2$ respectively,
and $|w_1| = |w_2| \le n$.

(ii) For some $D$, $(\epsilon, \epsilon, D) \in \mathcal{H}_D$.

(iii) Whenever $(r_1, r_2, D) \in \mathcal{H}_D$, and $w \in D$ for some $w$, such that $|w| < n$, and
for some $t_1$, $r_1.w_1 \xrightarrow{t_1} r_1.w_1.t_1$, then there is some $t_2$ such that $(w_1.t_1, w_2.t_2) \in D$.

Note that $(\epsilon, \epsilon) \in D$ because $D$ is prefix-closed and non-empty.

(iv) Vice versa.

(v) Whenever $(r_1, r_2, D) \in \mathcal{H}_D$ and $(t_1, t_2) \in D$, then there is some $D'$, such
that $(r_1.t_1, r_2.t_2, D') \in \mathcal{H}_D$ and

(a) $\forall w$ s.t. $|w| < n$. $tw \in D \Rightarrow w \in D'$.

(b) $\forall w'$. $w' \in D' \wedge t \mathrel{I} t'$ for all $t' \in w' \Rightarrow w' \in D$.

We now prove that (n)-D HPB is indeed equivalent to (n)-HHPB. As in Sec. 3
it is sufficient to consider only prefix-closed (n)-D HPBs since they correspond to
bisimulations that are built up inductively from the empty runs without adding
any "redundant tuples".

Definition 8. We say a (n)-D HPB $\mathcal{H}_D$ is prefix-closed iff whenever $(r_1.t_1, r_2.t_2,
D') \in \mathcal{H}_D$, then there is $(r_1, r_2, D) \in \mathcal{H}_D$ for some $D$ such that $t \in D$ and

1. $\forall w$ s.t. $|w| < n$. $tw \in D \Rightarrow w \in D'$.

2. $\forall w'$. $w' \in D' \wedge t \mathrel{I} t'$ for all $t' \in w' \Rightarrow w' \in D$.

Lemma 1. Two nets are (n)-hereditary HP bisimilar iff they are (n)-D HP
bisimilar.

Proof. For one direction let $\mathcal{H}$ be a (n)-HHPB relating $N_1$ and $N_2$. It is also safe
to assume prefix-closure of $\mathcal{H}$. We define $\mathcal{H}_D$ by assigning a matching directive
$D$ to every pair $(r_1, r_2)$. We take $D = \{w \mid |w| \le n \wedge r.w \in \mathcal{H}\}$. Prefix-closure
of $D$ is given by prefix-closure of $\mathcal{H}$, hence property (i) of definition 7 clearly
holds. Properties (ii), (iii), and (iv) are also trivial. To see that (v) holds, let
$(r_1, r_2, D) \in \mathcal{H}_D$ and $(t_1, t_2) \in D$. Then, due to the way $D$ is defined there
is $D'$ such that $(r.t, D') \in \mathcal{H}_D$. Condition (a) is also immediate by the way
matching directives are added to the tuples. To check condition (b) assume we
have $w' \in D' \wedge t \mathrel{I} t'$ for all $t' \in w'$. But then we have $r.t.w' \in \mathcal{H}$ with $t$ being
backtrack enabled. The fact that $|w'| \le n$ together with property (v) of definition
5 implies that $r.w' \in \mathcal{H}$. Hence, by definition of $D$ we have $w' \in D$.

For the other direction assume $\mathcal{H}_D$ to be a prefix-closed (n)-D HPB. Define
$\mathcal{H}$ by simply ignoring the matching directive $D$ of triples $(r_1, r_2, D) \in \mathcal{H}_D$. It
is clear that properties (i), (ii), (iii) and (iv) of the definition of (n)-HHPB are
satisfied. To prove property (v), let $r.t.w \in \mathcal{H}$ such that $t$ is backtrack enabled,
and $|w| \le n$. By prefix-closure of $\mathcal{H}_D$ we have $(r, D), (r.t, D') \in \mathcal{H}_D$ for some $D$,
$D'$ such that $t \in D$, $w \in D'$, and the two conditions of property (v) are satisfied.
But then we have $w \in D$ by condition (b), and thus $(r.w, D'') \in \mathcal{H}_D$ for some
$D''$ as required.

Now that we have expressed the backtracking condition in a forwards fashion,
we can proceed along the lines of the decidability proofs for HPB [16, 8]. We will
sketch these proofs, and thereby explain the remaining steps of our decidability
proof. For this we need a further definition from [8]. Let $p = (E_p, <_p, L_p, l_p)$ be
a pomset and $e, e' \in E_p$. Event $e'$ is a maximal cause of event $e$ in $p$ iff $e' <_p e$
and there is no event $e'' \in E_p$ such that $e' <_p e'' <_p e$.
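
The definition of maximal causes translates directly into a small computation. The sketch below assumes that the strict causal order $<_p$ is given as a transitively closed set of pairs; the event names and the helper `maximal_causes` are illustrative and not taken from [8].

```python
def maximal_causes(events, order, e):
    """Return the events e' with e' < e and no e'' such that e' < e'' < e.

    `order` is assumed to be a transitively closed set of pairs (x, y)
    meaning x < y.
    """
    below = [x for x in events if (x, e) in order]
    # e' is maximal if no other cause of e lies strictly above it.
    return {x for x in below if not any((x, y) in order for y in below)}


# Hypothetical pomset: a < b < d and c < d (transitively closed).
events = ["a", "b", "c", "d"]
order = {("a", "b"), ("a", "d"), ("b", "d"), ("c", "d")}
print(maximal_causes(events, order, "d"))
```

Here `a` is a cause of `d` but not a maximal one, because `b` lies in between.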

The key insight of the proofs of the decidability of HPB is the following fact: 
two isomorphic pomsets stay isomorphic after the addition of a pair of transitions 
iff the maximal causes of the new events are the same (up to isomorphism) in the
resulting pomsets. This means that we do not need to keep the entire history, but 
it is sufficient to record only those events that can act as maximal causes. The 
next step is to find a notion that contains this most-recent history, but is finite 
in the sense that there are only finitely many instances of it. Vogler develops 
the concept of ordered markings (OM), whereas Jategaonkar and Meyer find the 
notion of growth-sites. 

Instead of defining HPB on runs we can now base HPB on growth-sites or
OMs. The resulting bisimulations are called gsc-bisimulation and OM-bisimulation,
respectively. Jategaonkar and Meyer show that gsc-bisimulation is indeed
equivalent to HPB. Vogler proves the analogue for OM-bisimulation. As there
are only finitely many growth-sites or OMs for a system, these bisimulations can
be decided by exhaustive search. The decidability of HPB is then immediate.

We can define a growth-sites or OM bisimulation that corresponds to (n)-D 
HPB just as well, and we call the resulting notions (n)-D gsc-bisimulation and 
(n)-D OM-bisimulation. The proof that (n)-D gsc- and (n)-D OM-bisimulation 
indeed coincide with (n)-D HPB is a straightforward adaptation of the proofs 
in [8] and [16]. Since there are only finitely many matching directives of size 
n, (n)-D gsc- and (n)-D OM-bisimilarity can also be decided by exhaustive
search. Consequently, (n)-D HP bisimilarity is decidable and with it (n)-HHP 
bisimilarity. 

Theorem 1. For any n, it is decidable whether two finite nets are (n)-HHP 
bisimilar. 



4.2 Strictness of the Hierarchy 

It is a simple consequence of the definition that HHP bisimilarity implies
(n)-HHP bisimilarity for any n, which again implies (n')-HHP bisimilarity for n' < n.
Given the result of the previous section, an obvious question to ask is whether
HHP bisimilarity coincides with (n)-HHP bisimilarity for some fixed bound n.
The example of Fig. 1 shows that (0)-HHP bisimilarity is weaker than (1)-HHP
bisimilarity. Fig. 2 shows an elegant generalisation, which discriminates
(n)-hereditary from (n+1)-hereditary HP bisimilarity. Despite its simple appearance,
it was not at all trivial to find.

Let us first argue why no HHPB relates $N$ and $N'$. In any HHPB we must
match $a_i$ with $a_i'$, and $b_i$ with $b_i'$ for $1 \le i \le n$. Then one option in $N'$ is to
perform $a_{n+1}'$ and $b_{n+1}'$. These transitions have to be matched with either $a_{n+1}$
and $b_{n+1}$, or $a_{n+2}$ and $b_{n+2}$ respectively. Suppose we choose the match $a_{n+1}$,
$b_{n+1}$. We can now backtrack all the a-transitions such that $d$ becomes enabled
in $N'$. But no $d$ action is possible in $N$. If we choose $a_{n+2}$, $b_{n+2}$ as our match,
we can backtrack all the b-transitions. Then $c$ becomes possible in $N'$, but not
in $N$. The systems are clearly (n+1)-bounded asynchronous, so by Prop. 3 $N$
and $N'$ are not (n+1)-HHP bisimilar either.

Fig. 2. The nets N and N' are (n)-HHP bisimilar but not (n+1)-HHP bisimilar.

The above counter-strategy does not apply for (n)-HHPB, but we can use
the following strategy to match the critical $n{+}1$ transitions. Say we have to
match $a_{n+1}'$ and $b_{n+1}'$ has not been fired yet, i.e. we can still choose between
$a_{n+1}$ and $a_{n+2}$ as a match. We make our match dependent on the first transition
in the history. Assume it is an a-transition. Then it is safe to match $a_{n+1}'$ with
$a_{n+1}$, which determines that $b_{n+1}'$ is later matched with $b_{n+1}$. For $d$ to become
enabled in $N'$, we need to backtrack all the a-transitions, however there will be
$n+1$ b-transitions following the first $a$, so this is not possible. Similarly, it is safe
to match $a_{n+2}'$ with $a_{n+2}$. A symmetrical argument applies if the first action was
a b-action, and similarly for the remaining cases.

Lemma 2. For all $n \in \mathbb{N}_0$, there exist two finite nets that are (n)- but not
(n+1)-HHP bisimilar.

Theorem 2. For all $n \in \mathbb{N}_0$, (n)-HHP bisimilarity is strictly weaker than
(n+1)-HHP bisimilarity, and hence (unbounded) HHP bisimilarity.

5 Applications to the Decidability Problem of Hereditary 
HPB 

In the previous section we have shown that the hierarchy of (n)-HHPBs is strict. 
However, for any two fixed finite systems the hierarchy collapses, and so the 
decidability of the general problem would follow immediately, if the bound can
be effectively computed for any two given systems. At the moment, we tend to 
believe that this is indeed the case. Below, we will see that for some restricted 
classes of systems, the decidability of HHPB does reduce to deciding (n)-HHPB. 

Bounded Asynchronous Systems. We say that a net N is bounded asynchronous,
if there exists some natural number n such that N is (n)-bounded
asynchronous. Since finite 1-safe nets have only finitely many markings, it is
decidable if a net is (n)-bounded asynchronous for some n, and the bound n can
be computed. With Prop. 3 the decidability of HHPB for bounded asynchronous
systems follows immediately.

Proposition 1. For bounded asynchronous nets, HHP bisimilarity is decidable. 

Systems with Transitive Independence Relation. An independence relation
$I$ over an alphabet $\Sigma$ is transitive if, for every distinct $t, t', t'' \in \Sigma$,
$t \mathrel{I} t' \wedge t' \mathrel{I} t''$ implies $t \mathrel{I} t''$.
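
For a finite alphabet, transitivity of the independence relation can be tested by enumerating all triples of distinct transitions. This is only a sketch under the assumption that $I$ is symmetric and stored as a set of unordered pairs; the name `is_transitive_independence` is ours.

```python
from itertools import permutations


def is_transitive_independence(alphabet, indep):
    """Check that t I t' and t' I t'' imply t I t'' for distinct t, t', t''.

    `indep` is a set of frozensets {x, y}, one per independent pair.
    """
    def related(x, y):
        return frozenset((x, y)) in indep

    return all(
        related(t, t2)
        for t, t1, t2 in permutations(alphabet, 3)
        if related(t, t1) and related(t1, t2)
    )


# Hypothetical example: a, b, c pairwise independent (a clique of size 3).
clique = {frozenset("ab"), frozenset("bc"), frozenset("ac")}
print(is_transitive_independence("abc", clique))
```

When the relation is transitive, its independent transitions group into cliques, which is what makes the size of the maximal independence clique a meaningful bound.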

Let $N$ be a net. A transition $t \in T_N$ is a self-loop iff ${}^\bullet t = t^\bullet$. Intuitively, a
self-loop is a transition that can be repeated immediately, i.e. independently of
the occurrence of other transitions. Note that the existence of a run $r = r'.t.t$
implies that $t$ is a self-loop (in our context of 1-safe nets).

Let us first turn our attention to systems with transitive $I$ that do not
contain any self-loops. It is easy to see that for such systems the number of
transitions over which one can backtrack is bounded by the size of the maximal
independence clique. In other words, a system with maximal independence clique
of size k is (k)-bounded asynchronous, and hence decidability for finite systems
of this subclass is immediate.

If a system contains a self-loop that can occur concurrently with another
transition, then for all n this system is clearly not (n)-bounded asynchronous.
However, we can transfer the decidability result to the full class of finite systems
with transitive $I$ with the help of another key observation. In every (H)HPB
between two systems with transitive $I$, self-loop transitions that can occur
concurrently with other transitions always have to be matched to self-loops. Hence,
we do not need to consider the unfoldings of such self-loops. It is sufficient to
match the first occurrence of such a transition, when we make sure that the
match is indeed a self-loop. But then the number of transitions over which one
can backtrack is again bounded by the size of the maximal independence clique,
and so we have established decidability. The details of the proof can be found in
[7].

Theorem 3. For finite systems with transitive I, HHP bisimilarity is decidable. 

6 Final Remarks

There is still undiscovered land in the zone between plain and hereditary HPB. 
One possibility to advance the frontier is to identify system classes for which the 
two notions coincide. Several classes of such systems have already been found. 
The most interesting one is the system class of BPP in full standard form [6]. 
Plain and hereditary HPB for the class of free-choice nets have recently been 
shown not to coincide by the first author, disproving a conjecture in [3]. 

The trace-theoretical characterization looks promising for approaching the
decidability problem of HHPB, see [6] for more details.

Abstract. The maximal strong and weak bisimulations on any class of
processes can be obtained as the limits of decreasing chains of binary
relations, approximants. In the case of strong bisimulation and Basic Process
Algebras this chain has length at most $\omega$ which enables semidecidability
of strong non-bisimilarity. We show that it is not so for weak bisimulation
where the chain can grow much longer, and discuss the implications this
has for (semi)decidability of weak bisimilarity.



1 Introduction 

Algebraic descriptions are often used in concurrency theory for the design and 
specification of concurrent systems. There exist powerful process calculi such as 
CCS [6] that are very expressive. However, these calculi have the disadvantage 
that testing a property of a system may become infeasible. Simpler classes of 
processes are often built around fewer operators and do not possess the full ex- 
pressive power but they enable more efficient testing. One class of processes that 
has been studied a lot recently describes systems that can compose in a sequen- 
tial manner. These are Basic Process Algebras (BPA), originally introduced in 
[1] as the process equivalent of context-free grammars. Although the structure of 
BPA-processes is simple they capture quite a large class of infinite behaviours. 

One of the properties of systems that we are interested in is behavioural 
equivalence. For the sake of system design and verification we need to be able 
to specify and test when processes are equivalent with respect to some notion of 
observation. Among the most favoured behavioural equivalences are strong and 
weak bisimulations, introduced in [6]. One of the main issues for equivalences is 
the decidability problem; we want to determine whether a particular equivalence 
can be decided for any pair of processes from a fixed class. 

Strong bisimulation equivalence can be decided on the class of BPA-processes
(see e.g. [2], [3]). The classical test ([3]) consists of two semidecision procedures.
The algorithm for semideciding strong bisimilarity is based on the fact that for 
every BPA there exists a finite base from which all bisimilar pairs can be derived. 
The algorithm for semideciding non-bisimilarity approximates the maximal bi- 
simulation equivalence with a possibly infinite decreasing sequence of binary 
equivalences that always converge, with the limit being strong bisimulation. 

Weak bisimulation equivalence takes into account special behaviour whereby
a process can engage in internal transitions that are not observable by an outside
observer, and hence do not have to be matched by an equivalent process. Internal
behaviour is denoted by $\tau$, and any external transition can be preceded and
followed by an arbitrary sequence of $\tau$ transitions. When we abstract away from
internal transitions even simple systems such as BPA become infinitely branching,
i.e. the transition trees determined by BPA-processes may contain infinite branching.
This poses a potential complication to equivalence testing. Indeed, even for the
rather simple processes of BPA the decidability problem for weak bisimulation
equivalence remains open.

At present there are some partial results that assert decidability on strict
subclasses of BPA-processes ([4], [8]). However, these are subclasses where the 
power of internal behaviour is in some ways restricted. In the general case we are 
not even able to semidecide weak bisimilarity or weak non-bisimilarity. In this 
paper we concentrate on the technique for semideciding non-equivalence that is 
used for strong bisimilarity and we show that no straightforward application to 
weak bisimilarity seems possible. 

2 Background 

In order to define Basic Process Algebras we presuppose a fixed set of actions
$Act = \{a, b, c, \ldots\}$ that contains a special silent action $\tau$, and a finite set of
process variables or atoms $\Sigma = \{X_1, \ldots, X_n\}$. A Basic Process Algebra (BPA)
is then a pair $(\Sigma^*, \Delta)$, where $\Sigma^*$ is the free monoid generated by $\Sigma$, and
$\Delta = \{X \xrightarrow{\mu} P \mid X \in \Sigma, P \in \Sigma^*, \mu \in Act\}$ is a finite set of transitions.
BPA-processes are identified with words from $\Sigma^*$. The transition rules of $\Delta$ determine
a transition relation on general BPA-processes in this way:

$XQ \xrightarrow{\mu} PQ$ if there is a rule $X \xrightarrow{\mu} P$ in $\Delta$.

We will use capital letters $X, Y$ to range over process variables, $P, Q, R$ to range
over BPA-processes, and $\mu, \lambda$ to range over actions.
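
The prefix-rewriting character of this transition relation is easy to make concrete. The following sketch represents a process as a tuple of variables and a rule set as a dictionary; the rule set `rules` below, including its action labels, is a hypothetical example chosen for illustration and is not taken from the paper.

```python
def successors(process, rules):
    """One-step transitions of a BPA process given as a tuple of variables.

    `rules` maps a variable X to a list of (action, word) pairs, one for
    each rule X -action-> word. Only the leftmost variable may move:
    XQ -action-> PQ whenever X -action-> P is a rule.
    """
    if not process:
        return []  # the empty process has no transitions
    head, tail = process[0], process[1:]
    return [(action, tuple(word) + tail) for action, word in rules.get(head, [])]


# Hypothetical rules: X -a-> XY, X -b-> (empty), Y -c-> (empty).
rules = {
    "X": [("a", ("X", "Y")), ("b", ())],
    "Y": [("c", ())],
}
print(successors(("X", "Y"), rules))
```

Because only the head of the word is rewritten, each process has at most as many one-step successors as there are rules for its leftmost variable, so BPA-processes are finitely branching with respect to single transitions.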

Example 1. Here we present a simple BPA. The set of variables $\Sigma$ is $\{X, Y\}$ and
the transition rules of $\Delta$ are given below:

$Y \to \varepsilon \qquad X \to XY \qquad X \to \varepsilon$

The transition tree determined by the variable $X$ is sketched in figure 1.

The process $X$ can with a sequence of $n$ transitions generate $n$ copies of
$Y$. For $Y$ to perform any move the process has to dispose of the $X$ in front with an
$X \to \varepsilon$ move. Only then can any action of $Y$ be performed. It is always the leftmost
variable in a BPA-process that is allowed to carry out a transition. □

In order to incorporate the notion of internal behaviour we consider composite
actions $\stackrel{\mu}{\Rightarrow}$, where $\stackrel{\mu}{\Rightarrow}$ is an abbreviation of
$(\xrightarrow{\tau})^* \xrightarrow{\mu} (\xrightarrow{\tau})^*$ in case $\mu \ne \tau$ and
$(\xrightarrow{\tau})^*$ in case $\mu = \tau$. The process $X$ from Example 1 gives rise to an infinitely
branching tree that is shown in figure 2.

Fig. 1. The transition tree of the process X




Fig. 2. X as an infinitely branching tree 



We say that a binary relation $\mathcal{R}$ on processes is a weak bisimulation if for every
pair $(P, Q)$ from $\mathcal{R}$ and every action $\mu$ from $Act$ the following holds:

- for every $P \stackrel{\mu}{\Rightarrow} P'$ there exists $Q \stackrel{\mu}{\Rightarrow} Q'$ so that $(P', Q') \in \mathcal{R}$;

- for every $Q \stackrel{\mu}{\Rightarrow} Q'$ there exists $P \stackrel{\mu}{\Rightarrow} P'$ so that $(P', Q') \in \mathcal{R}$.

Two processes $P$ and $Q$ are weakly bisimilar if there exists a weak bisimulation
containing the pair $(P, Q)$. The union of all weak bisimulations gives rise to
the maximal weak bisimulation which is denoted by $\approx$. An equivalent definition
of weak bisimulation is phrased in terms of a single transition in the premise
followed by a composite transition. Both definitions yield identical maximal weak
bisimulations and we shall be using either of them, depending on the context.

In the definition above, the maximal weak bisimulation is obtained as the
union of smaller weak bisimulations. There exists an alternative approach (see
Milner [6]) where the maximal equivalence is obtained as the limit of a decreasing
chain of weak bisimulation approximants. These are binary relations on processes
labelled by ordinal numbers and defined inductively as follows:

- $P \approx_0 Q$ for all $P$ and $Q$;

- $P \approx_{\alpha+1} Q$ if for all actions $\mu$,
  • whenever $P \stackrel{\mu}{\Rightarrow} P'$ then there exists $Q \stackrel{\mu}{\Rightarrow} Q'$ so that $P' \approx_\alpha Q'$;
  • whenever $Q \stackrel{\mu}{\Rightarrow} Q'$ then there exists $P \stackrel{\mu}{\Rightarrow} P'$ so that $P' \approx_\alpha Q'$;

- $P \approx_\lambda Q$ if $P \approx_\alpha Q$ for every $\alpha < \lambda$, for a limit ordinal $\lambda$.

Ordinal numbers are representatives of classes of well-ordered sets. They form a
class, denoted by $On$, and are themselves well-ordered by the element-of relation
$<$. We shall be using some simple arithmetical operations on ordinals such as
summation and multiplication. They will also provide a measure for derivation
trees of particular processes which will implicitly refer to the notion of rank
(height) of a tree. For a detailed introduction to ordinals the reader should consult
standard textbooks on set theory, such as [5].

The binary relations $\approx_\alpha$ are equivalences for every $\alpha$. The following lemma
sums up the structure of the chain of approximants and the relationship between
individual approximants and the maximal bisimulation. A proof can be found
in [6], [9].

Lemma 1.

1. for every $\alpha, \beta \in On$, $\alpha < \beta \Rightarrow\ \approx_\beta\ \subseteq\ \approx_\alpha$;

2. for every $\alpha \in On$, $\approx\ \subseteq\ \approx_\alpha$;

3. if there is an $\alpha$ such that $\approx_\alpha\ =\ \approx_{\alpha+1}$ then for all $\beta > \alpha$, $\approx_\alpha\ =\ \approx_\beta\ =\ \approx$;

4. $\approx\ = \bigcap_{\alpha \in On} \approx_\alpha$.

1 says that approximants form a non-increasing sequence. 2 says that the
maximal equivalence is contained in every approximant. 3 and 4 state that the
sequence converges and the limit is $\approx$.

Note: The notion of strong bisimulation $\sim$, resp. strong bisimulation
approximants $\sim_\alpha$, is defined analogously to weak bisimulation, resp. weak
bisimulation approximants, where the composite transition $\stackrel{\mu}{\Rightarrow}$ is replaced by the single
transition $\xrightarrow{\mu}$. A lemma analogous to Lemma 1 holds, i.e. the chain of strong
bisimulation approximants converges and the limit is $\sim$. For finitely branching
processes (such as BPA-processes) the convergence occurs at level $\omega$, that is
$\sim\ =\ \sim_\omega\ = \bigcap_{n \in \omega} \sim_n$.

Every BPA-process has only finitely many possible derivatives, therefore each
approximant $\sim_n$ is decidable. Then a straightforward semidecision procedure
for non-bisimilarity proceeds by successive enumeration of all natural numbers
$n$ and testing equivalence at $\sim_n$. However, we shall see that this approach cannot
be used for weak bisimulation approximants as we shall establish that there exist
BPA where $\approx\ \subset\ \approx_\omega$.
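
The level-by-level test can be illustrated on a finite labelled transition system, where the chain of strong approximants stabilises after finitely many steps. The sketch below is not the procedure for BPA itself (BPA-processes are infinite-state, so there one tests a given pair at each $\sim_n$ separately); the state names and both helper functions are assumptions made for illustration.

```python
def strong_approximants(states, actions, step):
    """Yield the partitions induced by ~_0, ~_1, ... until they stabilise.

    `step(s, a)` returns the set of a-successors of state s. Each yielded
    dict maps a state to the identifier of its equivalence class.
    """
    block = {s: 0 for s in states}  # ~_0 relates all states
    while True:
        yield block
        # s ~_{n+1} t iff, for every action, s and t reach the same ~_n classes.
        sig = {
            s: tuple(frozenset(block[t] for t in step(s, a)) for a in actions)
            for s in states
        }
        ids = {}
        new_block = {s: ids.setdefault(sig[s], len(ids)) for s in states}
        # The new partition refines the old one, so equal size means equal.
        if len(set(new_block.values())) == len(set(block.values())):
            return
        block = new_block


def distinguishing_level(states, actions, step, p, q):
    """Smallest n with p not ~_n q, or None if p and q are bisimilar."""
    for n, block in enumerate(strong_approximants(states, actions, step)):
        if block[p] != block[q]:
            return n
    return None


# Hypothetical system: s0 -a-> s1 -a-> s2 and t0 -a-> t1.
trans = {("s0", "a"): {"s1"}, ("s1", "a"): {"s2"}, ("t0", "a"): {"t1"}}
states = ["s0", "s1", "s2", "t0", "t1"]


def step(s, a):
    return trans.get((s, a), set())


print(distinguishing_level(states, ["a"], step, "s0", "t0"))
```

In this example `s0` and `t0` agree at levels 0 and 1 and are first separated at level 2, which mirrors how non-bisimilarity is semidecided by enumerating finite levels.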

3 A Hierarchy of Non-bisimilar BPA-processes

In this section we will construct a hierarchy of processes that will distinguish
individual approximants $\approx_{\omega^n}$ (and all approximants in between) from the
maximal weak bisimulation $\approx$. We will start with a simple construction that will
later lend itself to straightforward generalisation. We define variables $C$ and $A$
by these transition rules:

$A \xrightarrow{a} \varepsilon \qquad A \xrightarrow{\tau} \varepsilon \qquad C \xrightarrow{\tau} CA \qquad C \xrightarrow{\tau} \varepsilon$

The transition tree determined by the variable $C$ is drawn in figure 3. The tree
contains infinite branching at the top level as $C$ can generate with a $\stackrel{\tau}{\Rightarrow}$ move
any number of copies of $A$. It is not difficult to see that $A^k \not\approx A^l$ for every
$k < l$. In order to distinguish $\approx_\omega$ from $\approx$ it suffices to consider the processes $C$
and $AC$.

Lemma 2. $C \approx_\omega AC$ and $C \not\approx AC$.

Proof. We will show that the processes $C$ and $AC$ are equivalent at $\approx_k$ for
all $k$ but not weakly bisimilar. Any move of $C$ can be matched by $AC$ after
discarding the $A$ in front with $AC \stackrel{\tau}{\Rightarrow} C$. To the move $AC \xrightarrow{\tau} C$ the variable
$C$ responds with $C \stackrel{\tau}{\Rightarrow} C$. Hence we only need to consider the transition
$AC \xrightarrow{a} C$ of $AC$. For a fixed $k$, if $AC \xrightarrow{a} C$ then $C$ generates enough copies of $A$ with
$C \stackrel{a}{\Rightarrow} A^N$ for some $N > k$. Then $C$ and $A^N$ will surely be related
at $\approx_k$ and we can conclude that $C \approx_\omega AC$. On the other hand, because $AC$
has one more copy of $A$ at its disposal, we obtain that $C \not\approx_{\omega+1} AC$ and hence
$C \not\approx AC$. □

We can use the two variables $C$ and $A$ to reach even higher. If we consider the
processes $C^n$ and $AC^n$ for some $n > 1$, then we can repeat the trick of generating
arbitrarily many copies of $A$ several (at most $n$) times. Informally, each $C$ gives
rise to an infinitely branching tier in the tree and since we have $n$ copies of $C$
to start with we construct a tree of height $\omega \cdot n$. Then by using the very same
argument we can conclude with the following lemma that will be stated without
a proof.

Lemma 3. For every $n \in \mathbb{N}$, $C^n \approx_{\omega \cdot n} AC^n$ and $C^n \not\approx AC^n$.

This idea can be generalised to an infinite hierarchy of variables that will enable
us to go beyond $\omega^2$. We define the variables inductively in this way:

1. $D_0 \xrightarrow{a} \varepsilon$, $D_0 \xrightarrow{\tau} \varepsilon$;

2. assuming we have defined the variables $D_0, \ldots, D_i$, the variable $D_{i+1}$ is
defined by $D_{i+1} \xrightarrow{\tau} D_{i+1} D_i$, $D_{i+1} \xrightarrow{\tau} \varepsilon$.

The variable $A$, resp. $C$, is renamed $D_0$, resp. $D_1$. Notice that the only variable
capable of performing a visible action is $D_0$. The purpose of the other variables
is to create bigger and bigger branching. We would like to show that for
every $i$, we can construct a pair of processes $P_i$ and $Q_i$, sequences over variables
$D_0, D_1, \ldots, D_i$, with the property that $P_i \approx_{\omega^i} Q_i$ and $P_i \not\approx Q_i$. Before we carry
out the desired construction we explain the behaviour of individual variables $D_i$
by analysing all the possible moves of $D_i$.

Fig. 3. The process C



Starting from a variable $D_i$ we can only do a $\stackrel{\tau}{\Rightarrow}$ sequence with which we
obtain the process $D_i D_{i-1}^{e_{i-1}}$ for some $e_{i-1}$. We cannot get a more complex shape
without removing the $D_i$ in front. After having discarded $D_i$ we can continue and
from $D_{i-1}^{e_{i-1}}$ generate (with another $\stackrel{\tau}{\Rightarrow}$ sequence) the process
$D_{i-1} D_{i-2}^{e_{i-2}} D_{i-1}^{e_{i-1}-1}$.
We repeat the procedure several times and finally derive a process either of the
form $D_{k+1} D_k^{e_k} \cdots D_m^{e_m}$ or $D_k^{e_k} \cdots D_m^{e_m}$, where $k \ge 0$, $m \le i$ and
$e_k, \ldots, e_m \ge 0$. The latter process will be denoted by $\prod_{i=k}^{m} D_i^{e_i}$, called a
product. We will see that it suffices to consider products as it is not difficult
to convince oneself that every $D_{k+1} D_k^{e_k} D_{k+1}^{e_{k+1}} \cdots D_m^{e_m}$ is weakly bisimilar to the
product $D_{k+1}^{e_{k+1}+1} \prod_{i=k+2}^{m} D_i^{e_i}$. That is obtained as a
corollary of the following proposition:

Proposition 1. For every $k$, $m$, and $l$, $D_{k+1}^{l+1} \approx D_{k+1} D_k^m D_{k+1}^l$.

The correctness of this proposition is established by checking that the relation
$\mathcal{R} = \{(D_{k+1} D_k^m D_{k+1}^l,\ D_{k+1} D_k^n D_{k+1}^l) \mid k, l, m, n \in \mathbb{N}\} \cup \{(D, D) \mid D_{k+1} D_k^m D_{k+1}^l \Rightarrow D,\ k, l, m \in \mathbb{N}\}$
is a weak bisimulation, which consists in verifying that it is closed
under expansion.

We will define a measure on products that will enable us to make statements
about the largest ordinal number that relates two non-bisimilar products. The
measure is not chosen arbitrarily but captures in some way the branching power
that processes possess. To a product process $\prod_{i=0}^{m} D_i^{e_i}$ we assign an ordinal
number $\omega^m e_m + \omega^{m-1} e_{m-1} + \cdots + \omega e_1 + e_0$. This notion has the
special property that all derivatives of a product are assigned a smaller or equal
ordinal which will be shown later.
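
Ordinals of this shape lie below $\omega^\omega$ and can be compared by their coefficients alone. The following is a small sketch, assuming a product is given by its exponent list $[e_0, \ldots, e_m]$; the encoding as a pair (number of significant coefficients, coefficients from the highest power down) is our own choice, made so that the built-in tuple ordering coincides with the ordinal ordering.

```python
def measure(exponents):
    """Ordinal assigned to the product D_0^{e_0} ... D_m^{e_m}.

    `exponents` is the list [e_0, ..., e_m]. The result encodes
    omega^m * e_m + ... + omega * e_1 + e_0 as a pair that compares,
    as a Python tuple, exactly like the ordinal does.
    """
    coeffs = list(exponents)
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()  # drop zero coefficients of the highest powers
    return (len(coeffs), tuple(reversed(coeffs)))


# D_2 has measure omega^2, D_1^3 has omega*3, D_0^5 D_1^2 has omega*2 + 5.
print(measure([0, 0, 1]) > measure([0, 3]) > measure([5, 2]))
```

With this encoding, the minimum of two measures that appears in the lemmas of this section is simply `min(measure(p), measure(q))`.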

Example 2. We consider the variables $D_2$, $D_1$ and $D_0$. Starting from $D_2$, we can
perform this derivation sequence: $D_2 \Rightarrow D_2 D_1^3 \Rightarrow D_1^3 \Rightarrow D_1 D_0^5 D_1^2 \Rightarrow
D_0^5 D_1^2 \Rightarrow D_0^4 D_1^2$. On the other hand, there is no derivation sequence that
would produce the process $D_2 D_1 D_0$. The ordinals assigned to each element of
the derivation sequence are $\omega^2$ to $D_2$, then $\omega \cdot 3$ to $D_1^3$, $\omega \cdot 2 + 5$ to $D_0^5 D_1^2$, and
$\omega \cdot 2 + 4$ to $D_0^4 D_1^2$. Finally, processes $D_1 D_1^2$ and $D_1 D_0^5 D_1^2$ are weakly bisimilar. □

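The following proposition formalises the preceding arguments and specifies all
possible derivatives of an arbitrary product. It shall help us to create intuition
about the relationship between individual products in terms of their branching
capacity.

Proposition 2. For a product process $\prod_{i=0}^{m} D_i^{e_i}$, if there is a derivation
$\prod_{i=0}^{m} D_i^{e_i} \stackrel{\mu}{\Rightarrow} P$ then $P$ is weakly bisimilar to some $\prod_{i=0}^{m} D_i^{f_i}$, where
$\omega^m f_m + \cdots + \omega f_1 + f_0 < \omega^m e_m + \cdots + \omega e_1 + e_0$ if $\mu = a$, and
$\omega^m f_m + \cdots + \omega f_1 + f_0 \le \omega^m e_m + \cdots + \omega e_1 + e_0$ if $\mu = \tau$.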

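Proof (Sketch). We assume a fixed product $\prod_{i=0}^{m} D_i^{e_i}$ and the corresponding
ordinal $\omega^m e_m + \cdots + \omega e_1 + e_0$. If $e_0 > 0$ then before removing all copies of $D_0$,
all sequences of transitions lead to a product of the form $D_0^{e} D_1^{e_1} \cdots D_m^{e_m}$ with
$e < e_0$ and therefore also
$\omega^m e_m + \cdots + \omega e_1 + e < \omega^m e_m + \cdots + \omega e_1 + e_0$.

Once we have exhausted all $D_0$ what remains is some product $\prod_{i=j}^{m} D_i^{e_i}$
with $j > 0$. This product can either step by step remove some of the front variables
which results in some $D_k^{e_k'} \cdots D_m^{e_m}$, where $k > j$ or $k = j$ and $e_j' < e_j$.
The respective ordinal is then $\omega^m e_m + \cdots + \omega^k e_k'$ which is less than the
original $\omega^m e_m + \cdots + \omega^j e_j$. Or, after a few such steps some variable $D_k$, $k \ge j$,
performs a sequence of transitions $D_k \stackrel{\tau}{\Rightarrow} D_k D_{k-1}^{e_{k-1}}$ which results in the process
$D_k D_{k-1}^{e_{k-1}} D_k^{e_k'-1} \cdots D_m^{e_m}$. Again, the respective ordinal
$\omega^m e_m + \cdots + \omega^k (e_k' - 1) + \omega^{k-1} e_{k-1}$
is less than $\omega^m e_m + \cdots + \omega^j e_j$. Then we can apply Proposition 1 to
conclude that $D_k D_{k-1}^{e_{k-1}} D_k^{e_k'-1} \cdots D_m^{e_m}$ is actually weakly bisimilar to
$D_k^{e_k'} \cdots D_m^{e_m}$. □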

We will conclude this section with a couple of results that describe the
relationship between pairs of products and individual approximants. The following
Lemma 4 specifies the maximal ordinal that relates two (non-bisimilar) products.
Assuming two products $P$ and $Q$ that are assigned ordinal numbers $\beta$ and $\gamma$, $P$
and $Q$ will surely be equivalent at $\approx_\alpha$ where $\alpha$ is any ordinal up to the minimum
of $\beta$ and $\gamma$.

Lemma 4. For all products, $\prod_{i=0}^{m} D_i^{e_i} \approx_\alpha \prod_{i=0}^{m} D_i^{f_i}$ where $\alpha \le \min\{\beta, \gamma\}$ with
$\beta = \omega^m e_m + \omega^{m-1} e_{m-1} + \cdots + \omega e_1 + e_0$ and
$\gamma = \omega^m f_m + \omega^{m-1} f_{m-1} + \cdots + \omega f_1 + f_0$.

We shall try to describe on an intuitive level why this is so. We assume that
initially we have two products $P$ and $Q$, and ordinals $\alpha \le \min\{\beta, \gamma\}$ where $\beta$
is the ordinal assigned to $P$ and $\gamma$ is assigned to $Q$. Without loss of generality,
we suppose that $\beta < \gamma$. That means that $Q$ has the ability to evolve exactly
into $P$ with a $\stackrel{\tau}{\Rightarrow}$ move and hence can copy all its moves. Hence it is the process $P$
that needs to keep up with $Q$ and in order to do so it will respond to moves
of $Q$ with the minimal loss of power. To demonstrate equivalence at $\approx_{\delta+1}$ with
$\delta < \alpha$, if $Q$ performs $\stackrel{\mu}{\Rightarrow}$ then $P$ will choose such a derivation that the resulting
product $P'$ will determine an ordinal $\eta \ge \delta$. The possibility of such a move for $P$
follows from Proposition 1. By finite application of these steps we reach 0 where
all processes are equivalent and consequently, $P$ has demonstrated equivalence
to $Q$ at level $\alpha$.

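Lemma 5 provides an estimate of an ordinal that distinguishes a pair of
non-bisimilar products. Given two products $P$ and $Q$ that are assigned ordinal
numbers $\beta$ and $\gamma$, $P$ and $Q$ will not be equivalent at $\approx_\alpha$ where $\alpha$ is strictly
greater than the minimum of $\beta$ and $\gamma$.

Lemma 5. If $\prod_{i=0}^{m} D_i^{e_i} \not\approx \prod_{i=0}^{m} D_i^{f_i}$ then
$\prod_{i=0}^{m} D_i^{e_i} \not\approx_\alpha \prod_{i=0}^{m} D_i^{f_i}$ where $\alpha > \min\{\beta, \gamma\}$ with
$\beta = \omega^m e_m + \cdots + \omega e_1 + e_0$ and $\gamma = \omega^m f_m + \cdots + \omega f_1 + f_0$.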

The argument that supports this lemma is in fact very similar to the intuitive
explanation we have presented to justify the claim of Lemma 4. Again we strongly
rely on Proposition 1. This time it is the product that is assigned a greater
ordinal that takes care not to go below the current level. To be more precise, if
$P$ and $Q$ are the examined products, $Q$ being the larger product and the level
of inspection being $\alpha$, then appropriate moves of $Q$ will be those that remain
above $\alpha$. From the assumption it follows that $P$ is less than $\alpha$ and therefore all
its moves will always stay below. By iterating transitions so that they satisfy
this condition we eventually reach a stage where some derivative of $Q$ will have a
move at its disposal but no $P$ derivative will be able to match it. Thus the
inequivalence of $P$ and $Q$ will be sealed.

We have sketched the ideas of the proofs of Lemma 4 and Lemma 5. The
complete proofs are rather technical as they proceed by transfinite induction on
the ordinals that are assigned to the pair of processes in question. As they are
also rather lengthy we do not include them in the paper. For full versions of the
proofs together with the full proof of Proposition 2 consult [9].

4 Lower and Upper Bounds 

In this section we will use the results of the previous section to deduce a lower
bound on the ordinal where convergence occurs for weak bisimilarity on BPA.
For a fixed $n$ we shall define a Basic Process Algebra $(\Sigma_n^*, \Delta_n)$ where
$\Sigma_n = \{D_0, D_1, \ldots, D_n\}$ and
$\Delta_n = \{D_0 \xrightarrow{a} \varepsilon,\ D_0 \xrightarrow{\tau} \varepsilon,\ D_{i+1} \xrightarrow{\tau} D_{i+1} D_i,\ D_{i+1} \xrightarrow{\tau} \varepsilon \mid 0 \le i < n\}$.
We consider the two processes $D_n$ and $D_0 D_n$ from $\Sigma_n^*$. Following
the approach of the previous section we assign $D_n$ the ordinal $\omega^n$, and $D_0 D_n$
the ordinal $\omega^n + 1$. Then we can apply Lemma 4 and Lemma 5 to obtain these
results:

$D_n \approx_{\omega^n} D_0 D_n$ and $D_n \not\approx_{\omega^n + 1} D_0 D_n$.

Therefore we can conclude that on the algebra $(\Sigma_n^*, \Delta_n)$, one can distinguish the
approximant $\approx_{\omega^n}$ from the maximal weak bisimulation $\approx$. This can be carried
out for any $n$, hence we come to the following conclusion:

Theorem 1. For every $\alpha < \omega^\omega$ there exists a Basic Process Algebra $(\Sigma^*, \Delta)$
such that $\approx\ \subset\ \approx_\alpha$ with respect to $(\Sigma^*, \Delta)$.

The implication is that a lower bound on the convergence level to $\approx$ is $\omega^\omega$. If we
analyse the construction we can see that in order to reach higher levels we need
to introduce new variables. Since we are only allowed to use a finite number of
variables in the definition of a BPA this leads to the following conjecture:

Conjecture 1. For Basic Process Algebras, $\approx\ =\ \approx_{\omega^\omega}$.

Now we shall try to establish some upper bounds on the level of convergence.
That does not seem to be so easy as we do not have appropriate tools that could
establish the maximal level of convergence, even for a specific algebra. It seems
that the only claim we can make stems from the fact that the process algebras
we deal with are countable. We have already shown that

$\approx_\alpha\ =\ \approx_{\alpha+1} \implies \approx_\alpha\ =\ \approx$,

that is if two subsequent levels $\alpha$ and $\alpha + 1$ define the same equivalence then all
levels $\beta$ for $\alpha < \beta$ are equal and hence equal the maximal weak bisimulation.
We can define only countably many processes and hence countably many pairs
of processes which means we can never distinguish more than countably many
approximants. That can be expressed as follows:

Lemma 6. $\approx\ =\ \approx_{\omega_1}$.

Obviously, this is a rather crude upper bound ($\omega_1$ is the first uncountable
ordinal). Bradfield observed that there exists a stronger upper bound that can
be obtained as follows. Non-bisimulation is an inductively defined property, and
the monotone (and indeed positive) operator over which induction occurs is
arithmetical, since the relation $\stackrel{\mu}{\Rightarrow}$ for BPA is clearly arithmetical. There is a
theorem due to Spector (consult Theorem IV.2.15 in [7]) that any inductive
definition over a monotone arithmetical (or even $\Pi^1_1$) operator has closure ordinal
$\le \omega_1^{CK}$, the least non-recursive ordinal.

5 Discussion 

To summarise the importance of the presented results we will compare strong
and weak bisimulation equivalences with regard to decidability. The classical
decision procedure for strong bisimilarity consists of two semidecision procedures.
The algorithm for semideciding bisimilarity searches for a finite base of $\sim$. The
complementary algorithm for semideciding $\not\sim$ checks individual approximants
$\sim_n$ step by step, and tests for equivalence at $n$.

The construction that was presented in the paper poses a serious problem to
the semidecidability of $\not\approx$. Obviously enumerating approximants up to some level
$\omega^n$ does not seem feasible. Moreover we may not be able to check equivalence at
$\approx_{\omega^n}$. The aforementioned construction is rather simple yet it is already not clear
what an appropriate method for testing (non) bisimilarity of a pair of processes
would be.

On the other hand, it seems plausible that in general there might exist a finite
base for the maximal weak bisimulation. Indeed it is rather easy to construct a
finite base for $\approx$ for every BPA $(\Sigma_n, \Delta_n)$ from the presented construction. Thus
we may conclude that semidecidability of $\approx$ appears plausible, in contrast with
semidecidability of $\not\approx$, which seems to require an entirely new technique.
Abstract. It is a classical result from graph theory that the edges of
an $l$-regular bipartite graph can be colored using exactly $l$ colors so that
edges that share an endpoint are assigned different colors. In this paper
we study two constrained versions of the bipartite edge coloring problem.

- Some of the edges adjacent to a pair of opposite vertices of an
  $l$-regular bipartite graph are already colored with $S$ colors that appear
  only on one edge (single colors) and $D$ colors that appear in two
  edges (double colors). We show that the rest of the edges can be
  colored using at most $\max\{\min\{l + D, 3l/2\}, l + (S+D)/2\}$ total colors. We
  also show that this bound is tight by constructing instances in which
  $\max\{\min\{l + D, 3l/2\}, l + (S+D)/2\}$ colors are indeed necessary.

- Some of the edges of an $l$-regular bipartite graph are already colored
  with $S$ colors that appear only on one edge. We show that the rest of
  the edges can be colored using at most $\max\{l + S/2, S\}$ total colors.
  We also show that this bound is tight by constructing instances in
  which $\max\{l + S/2, S\}$ total colors are necessary.



1 Introduction 

It is a classical result from graph theory [9] that the edges of an $l$-regular bipartite
graph can be colored using exactly $l$ colors so that edges that share an endpoint
are assigned different colors. We call such edge colorings legal colorings.

König's proof [9] is algorithmic, yielding a polynomial time algorithm for
finding optimal bipartite edge colorings. Faster algorithms have been presented
in [4,5,2,12]. These algorithms usually use as a subroutine an algorithm that
finds perfect matchings in bipartite graphs [6, 12].
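The matching-based approach can be sketched as follows. This is only an illustration under our own assumptions, not one of the faster algorithms of [4,5,2,12]: it peels off one perfect matching per color with a simple augmenting-path search. The names `edge_color_regular_bipartite` and `is_legal` are hypothetical.

```python
from collections import defaultdict


def edge_color_regular_bipartite(n, edges):
    """Color the edges of an l-regular bipartite multigraph with l colors.

    n: number of vertices on each side (left 0..n-1, right 0..n-1).
    edges: list of (left, right) pairs; parallel edges are allowed.
    Returns a list with a color in 0..l-1 for every edge index.
    """
    l = len(edges) // n
    remaining = set(range(len(edges)))
    coloring = [None] * len(edges)
    for color in range(l):
        # Adjacency of the still-uncolored graph: left vertex -> edge indices.
        adj = defaultdict(list)
        for e in sorted(remaining):
            adj[edges[e][0]].append(e)
        match_right = {}  # right vertex -> edge index used by the matching

        def augment(u, seen):
            # Try to match left vertex u, re-routing earlier choices if needed.
            for e in adj[u]:
                v = edges[e][1]
                if v in seen:
                    continue
                seen.add(v)
                if v not in match_right or augment(edges[match_right[v]][0], seen):
                    match_right[v] = e
                    return True
            return False

        for u in range(n):
            if not augment(u, set()):
                raise ValueError("graph is not regular bipartite")
        # The perfect matching found in this round forms one color class.
        for e in match_right.values():
            coloring[e] = color
            remaining.discard(e)
    return coloring


def is_legal(edges, coloring):
    """True if no two edges sharing an endpoint have the same color."""
    seen = set()
    for (u, v), c in zip(edges, coloring):
        if ("L", u, c) in seen or ("R", v, c) in seen:
            return False
        seen.update({("L", u, c), ("R", v, c)})
    return True
```

Removing a perfect matching from an $l$-regular bipartite graph leaves an $(l-1)$-regular one, which is why each round of the loop succeeds.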

Bipartite edge coloring can be used to model scheduling problems, e.g. timetabling.
An instance of timetabling consists of a set of teachers, a set of classes,
and a list of pairs $(t, c)$ indicating that the teacher $t$ has to teach class $c$ during a
time slot within the time span of the schedule ([12]). A timetable is an assignment
of the pairs to time slots, in such a way that no teacher $t$ and no class $c$ occurs in
two pairs that are assigned to the same time slot. Obviously, this is a bipartite
edge coloring problem. Usually, additional constraints are put on a timetable,
making the problem NP-complete [3].

In this paper, we study two constrained versions of the bipartite edge coloring
problem. Our first constrained version (problem A) can be described in the
following way. We are given an $l$-regular bipartite graph $G = (V_1, V_2, E)$ along
with a partial legal coloring of its edges that specifies a color for edges incident
to vertices $v_1 \in V_1$ and $v_2 \in V_2$. Therefore, each color can be used either on
one edge, in which case we call it a single color, or on two edges, one incident
to $v_1$ and one incident to $v_2$, in which case we call the color a double color. If
we denote by $S$ the number of single colors, by $D$ the number of double colors, and
by $U$ the number of edges incident to $v_1$ and $v_2$ which are uncolored, we have
that $2D + S + U = 2l$. We want to color the remaining edges of the graph so as to
minimize the total number of colors used.

The case where $U = 0$ has been studied in [11,7,10,8]. For this case, $l + (S+D)/2$
total colors are necessary and sufficient [8]. Mihail et al. [11] gave the
first (but not tight) solution to the specific subcase where $S = 2D = l$ and
showed how this solution can be used to approximate the wavelength routing
problem in trees. The edge coloring problem is solved by obtaining matchings
of the bipartite graph, and coloring them in pairs using detailed potential and
averaging arguments.

The papers [7, 10, 8] also use a bipartite edge coloring algorithm as a subroutine
of a wavelength routing algorithm. Both [7] and [10] concentrate on the
special case where $S = 2D = l$ and color the bipartite graph using $l + (S+D)/2 = 7l/4$
total colors. The main idea of the algorithm in [7] is similar to the one of [11]
but new techniques are used for partitioning the bipartite graph matchings into
groups that can be colored and accounted for independently. Implicitly, Kumar
and Schwabe [10] solve the same problem using different techniques. The main
part of our analysis is a generalization of [8].

The second constrained version of the bipartite edge coloring problem (problem B)
is slightly different. We are given an $l$-regular bipartite graph
$G = (V_1, V_2, E)$ along with a partial legal coloring of some of its edges. Each color
is used only on one edge. We denote by $S$ the number of colored edges. Our
objective is to color the remaining edges of the graph so as to minimize the total
number of colors used. To our knowledge, problem B has not been studied yet.



Summary of results. Our results for problem A can be summarized in the
following two theorems.

Theorem 1. There exists a polynomial time algorithm that properly colors the
uncolored edges of an $l$-regular bipartite graph constrained by $S$ single and $D$
double colors using at most $\max\{\min\{l + D, 3l/2\}, l + (S+D)/2\}$ colors.

Theorem 2. For each $S \ge 0$, $D \ge 0$ such that $S + 2D \le 2l$, and for each
$l > 0$ there exists an $l$-regular bipartite graph constrained by $S$ single and $D$
double colors for which any legal coloring of the remaining edges requires at least
$\max\{\min\{l + D, 3l/2\}, l + (S+D)/2\}$ total colors.

The results for problem B are the following.

Theorem 3. There exists a polynomial time algorithm that properly colors the
uncolored edges of an $l$-regular bipartite graph constrained by $S$ colors using at
most $\max\{l + S/2, S\}$ colors.

Theorem 4. For each $S \ge 0$, and for each $l > 0$ there exists an $l$-regular bipartite
graph constrained by $S$ colors for which any legal coloring of the remaining
edges requires at least $\max\{l + S/2, S\}$ total colors.
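For concreteness, the two bounds can be evaluated exactly with rational arithmetic. The sketch below takes the bound of Theorems 1 and 2 to be $\max\{\min\{l + D, 3l/2\}, l + (S+D)/2\}$, which is our reading of the expression, and the bound of Theorems 3 and 4 to be $\max\{l + S/2, S\}$. The function names are hypothetical.

```python
from fractions import Fraction


def bound_problem_a(l, S, D):
    """max{min{l + D, 3l/2}, l + (S + D)/2} as an exact rational number."""
    capped = min(Fraction(l + D), Fraction(3 * l, 2))
    mixed = l + Fraction(S + D, 2)
    return max(capped, mixed)


def bound_problem_b(l, S):
    """max{l + S/2, S} as an exact rational number."""
    return max(l + Fraction(S, 2), Fraction(S))
```

For the subcase $S = 2D = l$ with $l = 4$, `bound_problem_a(4, 4, 2)` evaluates to 7, which agrees with the $7l/4$ figure quoted for [7] and [10].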

The rest of our paper is organized as follows. In Section 2, we prove Theorem 1 
by giving an algorithm that solves problem A. In Section 3, we present our lower 
bounds for the problem. The results for problem B are outlined in section 4. 

2 The upper bound for problem A 

In this section we present our algorithm for solving problem A. 

The algorithm receives as input an $l$-regular bipartite graph $G = (V_1, V_2, E)$
with $V_1 = \{W_0, \ldots, W_n\}$ and $V_2 = \{A_0, \ldots, A_n\}$, where some edges incident
to $W_0$ and $A_0$ have been colored using $S$ single and $D$ double different colors.
We call edges incident to $W_0$ and $A_0$ the source edges. We assume without loss
of generality that no edge connects $W_0$ and $A_0$. If a color appears on only one
source edge, then we call it a single color. If it appears on two source edges, we
call it a double color; note that one of these two source edges has to be incident
to $W_0$ and the other to $A_0$. We denote by $D$ and $S$ the number of double and
single colors, respectively.

We proceed by decomposing the bipartite graph into $l$ perfect matchings,
which can always be done since the graph is $l$-regular. Each such matching
includes exactly two source edges: one incident to $W_0$ and one incident to $A_0$.
A double color is called separated if its two source edges appear in different
matchings. On the other hand, if they appear in the same matching then the
color is said to be preserved. We classify the matchings into seven types: UU,
US, UT, TT, PP, SS, TS, based on their corresponding source edges. If both
the source edges of a matching are not colored, then the matching is of type
UU. If one source edge of the matching is uncolored and the other source edge is
colored with a single color, then the matching is of type US. If one source edge of
the matching is uncolored and the other source edge is colored with a separated
color, then the matching is of type UT. If the two source edges of a matching
are colored with separated colors, then the matching is of type TT. If the two
source edges are colored with the same preserved color, then the matching is
of type PP. If the two source edges are colored with two single colors, then the
matching is of type SS. If the two source edges are colored with a single color
and with a separated color, then the matching is of type TS.
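The classification above can be written down directly. The following is a minimal sketch in which the representation is our own assumption: a matching is given by the colors of its two source edges (`None` for an uncolored edge), and the caller supplies the set of single colors; every other color is treated as a double color. The name `classify_matching` is hypothetical.

```python
def classify_matching(color_w, color_a, single_colors):
    """Return the type (UU, US, UT, TT, PP, SS or TS) of a matching.

    color_w, color_a: color of the source edge at W_0 / A_0, or None if uncolored.
    single_colors: set of colors that appear on exactly one source edge.
    """
    def kind(color, other):
        if color is None:
            return "U"
        if color in single_colors:
            return "S"
        # A double color is preserved if both of its source edges lie in this
        # matching, and separated otherwise.
        return "P" if color == other else "T"

    kinds = [kind(color_w, color_a), kind(color_a, color_w)]
    # Order the two letters so that the result matches the names used in the text.
    return "".join(sorted(kinds, key="UTPS".index))
```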
Chains and Cycles of Matchings. We partition the matchings of types UT,
TT, TS into groups. Each such group is either a chain or a cycle of matchings. A
chain of matchings is a sequence $(M_0, M_1, \ldots, M_{k-1})$ of $k$ matchings such that

1. $M_0$ and $M_{k-1}$ are matchings of type TS or UT;

2. $M_1, \ldots, M_{k-2}$ are all matchings of type TT;

3. for each $0 \le i \le k-2$, matchings $M_i$ and $M_{i+1}$ share exactly one double
   (separated) color.

A chain consists of at least two matchings and is of type S-S, S-U, or U-U.

A cycle of matchings is a sequence $(M_0, M_1, \ldots, M_{k-1})$ of $k$ TT matchings
such that, for each $0 \le i \le k-1$, matchings $M_i$ and $M_{(i+1) \bmod k}$ share exactly
one double (separated) color.
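The partition into chains and cycles amounts to following shared separated colors from matching to matching. The sketch below is illustrative only; the input format (a dictionary from matching names to the separated colors on their source edges) and the name `group_matchings` are assumptions of ours.

```python
def group_matchings(separated):
    """Group UT, TT and TS matchings into chains and cycles.

    separated: dict mapping a matching name to the list of separated colors
    on its source edges (one color for UT and TS, two colors for TT).
    Returns (chains, cycles), each a list of sequences of matching names.
    """
    by_color = {}
    for name, colors in separated.items():
        for color in colors:
            by_color.setdefault(color, []).append(name)

    neighbors = {name: [] for name in separated}
    # Each separated color lies on source edges of exactly two matchings.
    for first, second in by_color.values():
        neighbors[first].append(second)
        neighbors[second].append(first)

    visited = set()

    def walk(start):
        # Follow unvisited neighbors until the sequence cannot be extended.
        sequence, current = [], start
        while current is not None:
            visited.add(current)
            sequence.append(current)
            current = next((m for m in neighbors[current] if m not in visited), None)
        return sequence

    # Chains start at a matching with a single separated color (UT or TS).
    ends = [m for m in separated if len(separated[m]) == 1]
    chains = [walk(m) for m in ends if m not in visited]
    # Whatever is left consists of TT matchings only and forms cycles.
    cycles = [walk(m) for m in separated if m not in visited]
    return chains, cycles
```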



Minimal Chains and Cycles. A sequence $C$ of matchings (chain or cycle)
is minimal if it does not contain any two parallel source edges. A non-minimal
sequence of matchings can be split into two shorter sequences in the following
way. Consider the sequence $C = (M_0, \ldots, M_{t-1})$ of matchings and suppose that
the edge colored $c_i$ of $M_i$ and the edge colored $c_j$ of $M_j$ are parallel. We exchange
the two edges, thus obtaining two new matchings $M_i'$ and $M_j'$ with source edges
colored $c_j$ and $c_{i+1}$, and $c_i$ and $c_{j+1}$, respectively, and the two new sequences of matchings
$C_1 = (M_0, M_1, \ldots, M_{i-1}, M_j', M_{j+1}, \ldots, M_{t-1})$ and $C_2 = (M_i', M_{i+1}, \ldots, M_{j-1})$.
The sequence $C_1$ is of the same type (i.e., a cycle or a chain) as $C$ while $C_2$
is always a cycle. We repeat this process of splitting one sequence into two
new sequences until all sequences are minimal (i.e., they do not contain parallel
edges).



2.1 Coloring the matchings 

In this section we demonstrate how to color groups of matchings. 



Coloring two consecutive matchings. We will first present two alternative
ways for coloring two consecutive matchings. These techniques will be used for
coloring cycles or chains. We consider two consecutive matchings $T_1 = (x, y)$ and
$T_2 = (y, z)$ together as a cycle cover of the bipartite graph. We assume that the
cycle cover of two matchings consists of one single cycle that spans the entire
bipartite graph. We remark that our colorings can be easily adapted if such a
cycle cover consists of more than one cycle.

1. We use the colors $x, y, z$ and color the uncolored edges without using any
   new color. Let $e_1, e_2$ be the edges of the cycle cover that are adjacent to the
   source edge $e_x$ colored with color $x$ that does not belong to matchings $T_1$
   and $T_2$. Note that since $e_x$ belongs to a matching of the same minimal chain
   or cycle with $T_1$ and $T_2$, it cannot be parallel to the source edges colored
   with $z$ or $y$.

   We use colors $y$ and $z$ to color $e_1$ and $e_2$. Similarly, we use colors $y$ and $x$
   to color edges $e_3$ and $e_4$ (edges of the cycle cover that are adjacent to the
   source edge $e_z$ colored with $z$ that does not belong to matchings $T_1$ and $T_2$).
   The remaining uncolored edges of the cycle cover can be colored using colors
   $x$ and $z$ alternately (and possibly using color $y$ in one more edge to break
   parity).

2. We use color $y$ and a new color $n$ to color the uncolored edges. We color
   the uncolored edges of the cycle cover using color $y$ and the new color $n$
   alternately.

Both colorings are depicted in Figure 1. Note that both colorings work if $x$ or
$z$ is a single color.




Fig. 1. Two alternative colorings of two consecutive matchings.



Easy colorings. PP, SU, and UU matchings can be easily colored. Edges of a
PP matching are colored using the double color. Edges of an SU matching are
colored using the single color. Edges of a UU matching are colored using a new
color.



Coloring cycles. Using the two alternative colorings of consecutive matchings
we can color a cycle of length $t$ using $\lceil t/4 \rceil$ new colors.

Coloring cycles of length $4k$. Let $M_0 = (x_0, y_0)$, $M_1 = (y_0, z_0)$, $M_2 = (z_0, w_0)$,
$\ldots$, $M_{4k-1} = (w_{k-1}, x_0)$ be such a cycle. For every $0 \le i < k$ we color consecutive
matchings $M_{4i} = (x_i, y_i)$ and $M_{4i+1} = (y_i, z_i)$ with colors $x_i, y_i, z_i$ and consecutive
matchings $M_{4i+2} = (z_i, w_i)$, $M_{4i+3} = (w_i, x_{i+1})$ with color $w_i$ and a new
color $n_i$.

Coloring cycles of length $4k + 1$. Let $M_0 = (x_0, y_0)$, $\ldots$, $M_{4k} = (x_k, x_0)$ be such
a cycle. For every $0 \le i < k$ we color consecutive matchings $M_{4i} = (x_i, y_i)$ and
$M_{4i+1} = (y_i, z_i)$ with colors $x_i, y_i, z_i$ and consecutive matchings $M_{4i+2} = (z_i, w_i)$
and $M_{4i+3} = (w_i, x_{i+1})$ with color $w_i$ and a new color $n_i$. The matching $M_{4k}$ is
colored with a new color $n_k$.

Coloring cycles of length $4k + 2$. Let $M_0 = (x_0, y_0)$, $\ldots$, $M_{4k+1} = (y_k, x_0)$ be such
a cycle. For every $0 \le i < k$ we color consecutive matchings $M_{4i} = (x_i, y_i)$ and
$M_{4i+1} = (y_i, z_i)$ with colors $x_i, y_i, z_i$ and consecutive matchings $M_{4i+2} = (z_i, w_i)$
and $M_{4i+3} = (w_i, x_{i+1})$ with color $w_i$ and a new color $n_i$. The matchings $M_{4k}$
and $M_{4k+1}$ are colored with $y_k$ and a new color $n_k$.

Coloring cycles of length $4k + 3$. Let $M_0 = (x_0, y_0)$, $\ldots$, $M_{4k+2} = (z_k, x_0)$ be
such a cycle. For every $0 \le i \le k$ we color consecutive matchings $M_{4i} = (x_i, y_i)$
and $M_{4i+1} = (y_i, z_i)$ with colors $x_i, y_i, z_i$ and for $0 \le i \le k - 1$ consecutive
matchings $M_{4i+2} = (z_i, w_i)$, $M_{4i+3} = (w_i, x_{i+1})$ with color $w_i$ and a new color
$n_i$. The matching $M_{4k+2}$ will be colored with a new color $n_k$.
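The four cases differ only in how the last one to three matchings are handled. The sketch below lays out the grouping for a cycle of length $t$ and counts the new colors; under our reading of the four cases the count comes to $\lceil t/4 \rceil$. It only plans the groups and does not color individual edges. The name `cycle_coloring_plan` is hypothetical.

```python
def cycle_coloring_plan(t):
    """Plan the coloring of a cycle M_0, ..., M_{t-1} of TT matchings (t >= 2).

    Returns (groups, new_colors). Each group is (matching indices, uses_new_color).
    """
    groups, new_colors, i = [], 0, 0
    while t - i >= 4:
        groups.append(((i, i + 1), False))     # colored with x_i, y_i, z_i only
        groups.append(((i + 2, i + 3), True))  # colored with w_i and a new color n_i
        new_colors += 1
        i += 4
    rest = t - i
    if rest == 1:
        groups.append(((i,), True))            # last matching gets a new color
    elif rest == 2:
        groups.append(((i, i + 1), True))      # colored with y_k and a new color
    elif rest == 3:
        groups.append(((i, i + 1), False))     # colored with x_k, y_k, z_k
        groups.append(((i + 2,), True))        # last matching gets a new color
    if rest:
        new_colors += 1
    return groups, new_colors
```

For example, a cycle of length 3 needs one new color and a cycle of length 4 needs one as well, in line with the counts used later under "Other colorings".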



Coloring chains of type S-S. Using the two alternative coloring of consecu- 
tive matchings we can color a S-S chain of length t using new colors. 



Coloring chains of type S-U. Consider an S-U chain of length $t$. We assign
the single color to the uncolored source edge. Now we have a cycle which can be
colored as above. The number of new colors is $\lceil t/4 \rceil$.

Coloring chains of type U-U. Consider a U-U chain of length $t$. We use a
new color and assign it to both uncolored source edges. Now we have a cycle
which can be colored as above. The number of new colors is $1 + \lceil t/4 \rceil$.



Other colorings. We now discuss how to handle some interesting cases. 

An SS matching can be colored together with a U-U chain of length 2 using
at most 4 total colors. First we assign the single colors of the SS matching to
the uncolored source edges of the UT matchings of the chain. Now we have a
cycle of length 3. If the cycle is minimal we can color it using one new color.
Otherwise, we obtain a cycle of length 2 which can be colored using one extra
color and a PP matching which is colored in the obvious way. Obviously, we can
color an SS matching together with two U-U chains of length 2 using at most 7
total colors.

A U-U chain of length 2 can be colored together with an SU matching using 
at most 4 total colors. We first assign a new color to the uncolored edge of the 
SU matching and we have an SS matching and a U-U chain of length 2 which 
is colored as described. 

A U-U chain of length 2 can be colored together with an S-S chain of length 
2 using at most 6 total colors. We first assign the single colors of the S-S chain 
to the uncolored edges of the U-U chain. Now we have a cycle of length 4. If the 
cycle is minimal we can color it using 1 new color. Otherwise, it is decomposed 
either into two cycles of length 2 which can be colored using 2 new colors, or 
into a cycle of length 3 which is colored using 1 new color and a PP matching 
which is colored trivially. 
2.2 Analysis of the algorithm

For analyzing the performance of our algorithm, we study four cases which are
presented below. We note that the analysis below intuitively reveals the inherent
difficulty that the presence of uncolored source edges adds to the problem.

Case 1. $U \le S$. The valid inequalities we have to consider are the following:

$U \le D \le l/2 \le S$, $U \le l/2 \le D \le S$, $U \le S \le l/2 \le D$,

$U \le l/2 \le S \le D$, $D \le U \le l/2 \le S$, $D \le l/2 \le U \le S$.

All other cases violate the constraint $2D + S + U = 2l$. Note that in all cases it
is $l + (S+D)/2 \ge \min\{l + D, 3l/2\}$.

We use $U$ single colors to color the uncolored source edges and we have a new
partial coloring of source edges with $D' = D + U$ double colors, and $S' = S - U$
single colors. Note that using the colorings described in the previous section, any
set of S-S chains, cycles, PP and SS matchings of size $k$ with $D_k$ double and $S_k$
single colors is colored using at most $k + (D_k + S_k)/2$ total colors. Thus, the remaining
edges of the bipartite graph are colored using at most $l + (D' + S')/2 = l + (D + S)/2$ total
colors. SS matchings are colored using one of the single colors.

Case 2. $S \le U$ and $D \ge l/2$. The valid inequalities we have to consider are the
following:

$S \le l/2 \le D \le U$, $S \le U \le l/2 \le D$, $S \le l/2 \le U \le D$.

In all cases it is $3l/2 = \min\{l + D, 3l/2\} \ge l + (S+D)/2$.

We use the single colors and new double colors to color the uncolored source
edges. We have a new partial coloring with $l$ double colors. All matchings are now
either PP's or cycles. Using the colorings for PP matchings and cycles described
in the previous section, the remaining edges are colored using at most $3l/2$ total
colors.

Case 3. $S \le D \le l/2 \le U$. Note that $l + D \ge l + (S+D)/2$. Let $k_{SS}$ be the number of
SS matchings, $k_{STTS}$ the number of S-S chains of length 2, $k_{STTU}$ the number
of S-U chains of length 2, $k_{UTTU}$ the number of U-U chains of length 2, and
$k_{PP}$ the number of PP matchings.

Matchings except SS, PP, and chains of length 2 can be colored with

$$l - k_{SS} - 2k_{STTS} - 2k_{STTU} - 2k_{UTTU} - k_{PP} + \frac{D - k_{STTS} - k_{STTU} - k_{UTTU} - k_{PP}}{2}$$

colors. S-S and S-U chains of length 2 are colored with $3k_{STTS} + 3k_{STTU}$ colors,
while PP matchings are trivially colored with $k_{PP}$ colors. In total this gives

$$l - k_{SS} - 2k_{UTTU} + \frac{D + k_{STTS} + k_{STTU} - k_{UTTU} - k_{PP}}{2}$$

colors. Now we distinguish between two subcases:
- $k_{SS} \le k_{UTTU}$. Then $k_{SS}$ SS matchings are colored together with $k_{SS}$ U-U
  chains of length 2 using $4k_{SS}$ colors. The remaining $k_{UTTU} - k_{SS}$ U-U chains of
  length 2 are colored with $3k_{UTTU} - 3k_{SS}$ colors. The total number of colors
  is

  $$l + \frac{D + k_{STTS} + k_{STTU} + k_{UTTU} - k_{PP}}{2} \le l + D.$$

- $k_{SS} > k_{UTTU}$. Then $k_{UTTU}$ SS matchings are colored together with $k_{UTTU}$
  U-U chains of length 2 with $4k_{UTTU}$ colors. The remaining $k_{SS} - k_{UTTU}$ SS matchings
  are colored with $2k_{SS} - 2k_{UTTU}$ colors. The total number of colors is

  $$l + k_{SS} + \frac{D + k_{STTS} + k_{STTU} - k_{UTTU} - k_{PP}}{2}.$$

  Note that

  $$k_{SS} \le \frac{S - 2k_{STTS} - k_{STTU}}{2} \le \frac{D - 2k_{STTS} - k_{STTU}}{2},$$

  so the total number of colors is at most $l + D - \frac{k_{STTS} + k_{UTTU} + k_{PP}}{2}$.



Case 4. $D \le S \le U$. The valid inequalities we have to consider are the following:
$D \le S \le l/2 \le U$, $D \le l/2 \le S \le U$.

Note that $l + (S+D)/2 \ge l + D$. Consider a set of matchings of size $k$ consisting
of cycles, U-U chains of length $\ge 3$, S-S and S-U chains, PP, SU, UU and
SS matchings with $D_k$ double colors and $S_k$ single colors. The colorings of the
previous section color any such set of matchings using $k + (D_k + S_k)/2$ colors. Thus,
we only have to explain how to color U-U chains of length 2.

Let $k_{SS}$ be the number of SS matchings, $k_{SU}$ be the number of SU matchings,
$k_{STTS}$ be the number of S-S chains of length 2, $k_{S\text{-}U}$ be the number of S-U
chains, $k_{S\text{-}S}$ be the number of S-S chains of length $\ge 3$, and $k_{UTTU}$ be the
number of U-U chains of length 2. It is

$$S = 2k_{SS} + k_{SU} + 2k_{STTS} + k_{S\text{-}U} + 2k_{S\text{-}S} \ge D \ge k_{UTTU} + k_{STTS} + k_{S\text{-}U} + 2k_{S\text{-}S}$$

$$\Rightarrow\ 2k_{SS} + k_{SU} + k_{STTS} \ge k_{UTTU}.$$

Thus, U-U chains of length 2 can either be grouped into pairs and colored
together with an SS matching, or colored together with an SU matching or an
S-S chain of length 2.
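The inequalities claimed at the start of the four cases are easy to check exhaustively for small parameters. The sketch below does so under two assumptions of ours: the case conditions are read with non-strict inequalities, and the two expressions compared are $l + (S+D)/2$ and $\min\{l + D, 3l/2\}$. The name `check_case_analysis` is hypothetical.

```python
from fractions import Fraction


def check_case_analysis(max_l=12):
    """Check the inequalities claimed in Cases 1-4 for all instances with l <= max_l."""
    for l in range(1, max_l + 1):
        half = Fraction(l, 2)
        for D in range(l + 1):
            for S in range(2 * l - 2 * D + 1):
                U = 2 * l - 2 * D - S                    # from 2D + S + U = 2l
                mixed = l + Fraction(S + D, 2)           # l + (S + D)/2
                capped = min(Fraction(l + D), 3 * half)  # min{l + D, 3l/2}
                claims = []
                if U <= S:                               # Case 1
                    claims.append(mixed >= capped)
                if S <= U and D >= half:                 # Case 2
                    claims.append(capped == 3 * half >= mixed)
                if S <= D <= half <= U:                  # Case 3
                    claims.append(l + D >= mixed)
                if D <= S <= U:                          # Case 4
                    claims.append(mixed >= l + D)
                # Every instance must fall into some case and every claim must hold.
                if not claims or not all(claims):
                    return False
    return True
```

The check also confirms that the four cases together cover every admissible triple $(S, D, U)$.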



3 Lower bounds for problem A 

Consider the graph of Figure 2. Let $D \le l/2$. Assume that there exist $D$ edges
between vertices $X_0$ and $Y_1$ colored with double colors. There are $D$ edges
between $Y_0$ and $X_2$ which are either uncolored or colored with single colors. There
are also $l - D$ edges between $Y_0$ and $X_1$ which include all edges adjacent to $Y_0$
that are colored with double colors. Then, for coloring the edges adjacent to $X_2$
we cannot use the $D$ double colors. Thus, $l + D$ total colors are necessary.

Let $D > l/2$. Select a set $C$ of $l/2$ double colors and consider the following
partial coloring. There exist $l/2$ edges between vertices $X_0$ and $Y_1$ colored with
double colors of $C$. There are $l/2$ edges between $Y_0$ and $X_2$ which are either
uncolored or colored with colors not in $C$. There are also $l/2$ edges between $Y_0$
and $X_1$ colored with the double colors of $C$. Then, for coloring the edges adjacent
to $X_2$ we cannot use the $l/2$ double colors of the set $C$. Thus, $3l/2$ total colors
are necessary.

Consider now the following bipartite graph and partial coloring. There are
$(S+D)/2$ edges between $X_0$ and $Y_1$ which are colored with half the double colors
and half the single colors. There also exist $(S+D)/2$ edges between $Y_0$ and $X_2$ which
are colored with the double colors not assigned to edges between $X_0$ and $Y_1$ and
$S/2$ single colors. The $l - (S+D)/2$ edges between $X_0$ and $Y_2$ are either uncolored
or colored with double colors also assigned in edges between $Y_0$ and $X_2$. The
$l - (S+D)/2$ edges between $Y_0$ and $X_1$ are either uncolored or colored with double
colors also assigned in edges between $X_0$ and $Y_1$. Then for coloring the $l - (S+D)/2$
edges between $X_2$ and $Y_1$ we must use new colors. This means that $l + (S+D)/2$ total
colors are necessary.




Fig. 2. The bipartite graph for the lower bound.



4 Problem B 

In this section we deal with problem B. We first give the lower bound. 

Consider an $l$-regular bipartite graph $G = (V_1, V_2, E)$ and let $v_1 \in V_1$ and
$v_2 \in V_2$. Let $S \le 2l$. There are $S/2$ edges adjacent to $v_1$ but not to $v_2$ and
$S/2$ edges adjacent to $v_2$ but not to $v_1$. These edges are already colored with $S$
colors. There also exist $l - S/2$ edges between $v_1$ and $v_2$ which must be colored
with extra colors. Thus, $l + S/2$ total colors are necessary.

In the following we outline the idea of the upper bound. First, the bipartite 
graph is decomposed into matchings. Let U be the set of matchings that do not 
contain any colored edge, and F be the set of matchings that contain at least 3 
colored edges. We can show the following claim which captures the most difficult 
part of the upper bound. 
Claim. A matching of $F$ with $k$ colored edges can be colored together with
$\lfloor (k-1)/2 \rfloor$ matchings of $U$ without using any new colors.

Proof. We first show how a matching $M_3$ of $F$ with 3 colored edges can be colored
together with a matching $M_0$ of $U$ without using any new color. We consider
$M_3$ and $M_0$ together as a cycle cover of the bipartite graph. Wlog we assume
that the cycle cover consists of one cycle that spans the entire bipartite graph.
Let $e_x, e_y, e_z$ be the colored edges of $M_3$, colored with colors $x, y, z$ respectively.
Consider the path $p_1$ that connects edges $e_y$ and $e_z$ and does not contain $e_x$.
We color the uncolored edges of $p_1$ using colors $y$ and $x$ alternately. Coloring
the uncolored edges of path $p_2$ between $e_x$ and $e_z$ and path $p_3$ between $e_x$ and
$e_y$ is similar. Obviously, we can color a matching of $F$ with more than 3 colored
edges together with a matching of $U$ without using any new color.

Now consider a matching $M_k \in F$ with $k \ge 5$ colored edges and let $M_0 \in U$.
Let $C$ be the set of $k$ edges of $M_k$ which are already colored. For any subset $C'$
of $C$ of cardinality at least 4, there exists at least a pair of edges $e_x, e_y \in C'$ that
are not adjacent to the same edge of $M_0$. Otherwise, there exists an edge $e' \in C'$
with at least 3 adjacent edges in $M_0$, a contradiction since $M_0$ is a matching. So
$M_0$ can be colored with colors assigned to $e_x$ and $e_y$. Iteratively, we can color
$\lfloor (k-3)/2 \rfloor$ matchings of $U$ using $2\lfloor (k-3)/2 \rfloor$ colors of edges of $M_k$ without using
any new color. There are 3 or 4 (if $k$ is even) colors in edges of $M_k$ that were not used for
coloring any matching of $U$; so we can easily color the uncolored edges of $M_k$
together with another matching of $U$. The claim follows. □

We use the claim above to group and color the maximum number of match- 
ings in U along with matchings in F. Then each one of the remaining matchings 
of U (if any) can be trivially colored with an extra color; similarly matchings 
with no more than 2 colored edges (but with at least one) can be colored without 
using any extra color. 

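Let $k_1, \ldots, k_{|F|}$ be the number of colored edges in matchings of $F$. The number
of uncolored matchings is

$$|U| \le l - |F| - \frac{S - \sum_{i=1}^{|F|} k_i}{2},$$

so the number of new colors (if any) will be

$$|U| - \sum_{i=1}^{|F|} \left\lfloor \frac{k_i - 1}{2} \right\rfloor \le l - |F| - \frac{S - \sum_{i=1}^{|F|} k_i}{2} - \sum_{i=1}^{|F|} \frac{k_i - 2}{2} = l - \frac{S}{2}$$

and the number of total colors does not exceed $l + S/2$.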
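The counting argument can be replayed numerically. The sketch below assumes, as the proof of the claim suggests, that a matching with $k \ge 3$ colored edges absorbs $\lfloor (k-1)/2 \rfloor$ uncolored matchings, and that matchings with one or two colored edges need no new color. It works only with the number of pre-colored edges per matching and ignores the graph itself. The name `new_colors_problem_b` is hypothetical.

```python
def new_colors_problem_b(counts):
    """Number of new colors used, given the pre-colored edge count of each matching.

    counts: one entry per matching of the decomposition (so l = len(counts)
    and S = sum(counts)).
    """
    uncolored = sum(1 for k in counts if k == 0)              # |U|
    absorbed = sum((k - 1) // 2 for k in counts if k >= 3)    # handled via the claim
    # Each uncolored matching that is not absorbed gets one extra color.
    return max(0, uncolored - absorbed)
```

With `counts = (0, 0, 0, 4)` we get $l = 4$, $S = 4$ and two new colors, so six colors in total, which meets the bound $l + S/2 = 6$.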
Abstract. The “Constraint Bipartite Vertex Cover” problem (CBVC
for short) is: given a bipartite graph $G$ with $n$ vertices and two positive
integers $k_1, k_2$, is there a vertex cover taking at most $k_1$ vertices from one
and at most $k_2$ vertices from the other vertex set of $G$? CBVC is NP-complete.
It formalizes the spare allocation problem for reconfigurable
arrays, an important problem from VLSI manufacturing.

We provide the first nontrivial so-called “fixed parameter” algorithm for
CBVC, running in time $O(1.3999^{k_1+k_2} + (k_1 + k_2)n)$. Our algorithm is
efficient for small values of $k_1$ and $k_2$ as occurring in applications.



1 Introduction 

Nontrivial upper bounds for important NP-hard problems by exact algorithms
have excited a continuous interest for many years [2, 10-12]. With the advent of
parameterized complexity theory [3] a special class of exact algorithms has become
more and more important [4, 7]. As an example, consider the NP-complete
Vertex Cover problem: given a graph $G = (V, E)$ and a positive integer $k$, is
there a subset of vertices $C \subseteq V$ of size $|C| \le k$ such that each edge in $E$ has
at least one endpoint in $C$? Setting $n := |V|$, the best known exact algorithm
for this problem has a running time of $O(1.211^n)$ [10]. There is, however, another
exact (“fixed parameter”) algorithm solving Vertex Cover in running time
$O(1.29175^k + kn)$ [9]. For instance, already for $k \le n/2$ it is much more efficient
than the $O(1.211^n)$ algorithm.
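A quick numerical comparison illustrates the gap between the two bounds. The sketch simply evaluates the expressions inside the O-notation and ignores hidden constants, so it indicates orders of magnitude only. The function names are hypothetical.

```python
def exact_bound(n):
    """Value of 1.211^n, the bound of the exact algorithm."""
    return 1.211 ** n


def fixed_parameter_bound(k, n):
    """Value of 1.29175^k + k*n, the bound of the fixed parameter algorithm."""
    return 1.29175 ** k + k * n
```

For $n = 100$ and $k = 50$ the fixed parameter expression is several hundred times smaller than the exact one. For very small $n$ the additive term $kn$ can dominate, so the comparison is meaningful only for moderately large instances.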

The fixed parameter algorithm for Vertex Cover mentioned above is valuable
from a practical point of view, where small values of $k$ can often be assumed
because of the application behind [3,4]. In this paper, we study a variant of the
Vertex Cover problem that is also NP-complete, motivated by reconfigurable
VLSI [6]: given a bipartite graph $G = (V_1, V_2, E)$ and two positive integers $k_1$ and
$k_2$, are there two subsets $C_1 \subseteq V_1$ and $C_2 \subseteq V_2$ of sizes $|C_1| \le k_1$ and $|C_2| \le k_2$
such that each edge in $E$ has at least one endpoint in $C_1 \cup C_2$? The existence of
two parameters and two vertex sets makes this problem, called Constraint Bipartite
Vertex Cover (CBVC), quite different from the original one. Thus, whereas
the classical Vertex Cover problem (with only one parameter!) restricted to bipartite
graphs is solvable in polynomial time (because it is equivalent to a polynomial
time solvable maximal matching problem), by a reduction from Clique it
has been shown that CBVC is NP-complete [6]. Here, to our best knowledge, we
give the first nontrivial fixed parameter algorithm for the Constraint Bipartite
Vertex Cover problem running in time $O(1.3999^{k_1+k_2} + (k_1 + k_2)n)$. We conjecture
that, due to the different combinatorial structure in comparison with Vertex
Cover, it should be very hard to get an exponential base close to the one there
($1.29175^k$) [9].

* Partially supported by a Feodor Lynen fellowship of the Alexander von Humboldt-
Stiftung, Bonn, and the Center for Discrete Mathematics, Theoretical Computer
Science and Applications (DIMATIA), Prague.

Our result makes the following two contributions: First, it provides a further
example (of which there are still few) for a problem with an efficient fixed
parameter algorithm [3, 4, 7]. Second, our result is not only of theoretical interest,
but is also valuable with regard to practical applications. This is also due to the
fact that for the VLSI application behind it is very natural to assume small
values for $k_1$ and $k_2$ compared to the total number of graph vertices $n$. Hence,
our exact algorithm with its proven upper bound may successfully compete with
heuristic algorithms in use, as first implementation tests indicate.

To achieve our result, we employ well-known methods from parameterized
complexity [3]: reduction to problem kernel and bounded search trees. These
have already been successfully applied for Vertex Cover [1,4,9], Maximum
Satisfiability [8], and elsewhere [3]. The main technical contribution of our work is
the development of a search tree of size $1.3999^{k_1+k_2}$ which requires numerous
case distinctions based on combinatorial considerations that are very different
from the classical Vertex Cover case. For all details missing in this extended
abstract, we refer to the corresponding technical report¹.

2 Preliminaries

We assume familiarity with fundamentals from graph theory, algorithms, and
complexity. We make use of the following notation for a graph $G = (V, E)$:
Writing $G - X$ means that we delete vertex $X$ and all its incident edges from
graph $G$. By $N(X)$ we denote the set of neighbors of $X$ in $G$. By $\delta(X)$ we denote
the degree of vertex $X$, that is, $|N(X)|$. A graph is called $r$-regular if every vertex
has degree $r$. In this paper, we only deal with bipartite graphs. For the ease of
presentation, we treat them as two-colored (black and white) graphs with each
vertex having a color opposite to all its neighbors.

Our algorithm works recursively. The number of recursions is the number of
nodes in the corresponding search tree. This number is governed by homogeneous,
linear recurrences with constant coefficients (cf. [9]). If the algorithm solves a
problem of size $k$ and calls itself recursively for problems of sizes $k - d_1, \ldots, k - d_i$,
then $(d_1, \ldots, d_i)$ is called the branching vector of this recursion. It corresponds
to the recurrence $t_k = t_{k-d_1} + \cdots + t_{k-d_i}$. The characteristic polynomial of this
recurrence is

$$z^d = z^{d-d_1} + \cdots + z^{d-d_i}, \qquad (1)$$

where $d = \max\{d_1, \ldots, d_i\}$. If $\alpha$ is a root of (1) with maximum absolute value,
then $t_k$ is $|\alpha|^k$ up to a polynomial factor. We call $|\alpha|$ the branching number which
corresponds to the branching vector $(d_1, \ldots, d_i)$. Moreover, if $\alpha$ is a single root,
then even $t_k = O(\alpha^k)$, and all branching numbers that will occur in this paper
are single roots. In this paper, the size of the search tree is therefore $O(\alpha^k)$,
where $k = k_1 + k_2$ is the parameter and $\alpha$ is the biggest branching number
that will occur; it is about 1.3999 and belongs to the branching vector
(8, 9, 8, 10, 11, 10, 10, 11, 10, 10, 12, 11, 9, 10, 9, 10, 12, 11, 7, 9, 8, 9, 10, 9).

¹ Technical Report KAM-DIMATIA Series 99-424, Faculty of Mathematics and
Physics, Charles University, Prague, March 1999.
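A branching number can be computed numerically from its branching vector. The sketch below is our own illustration: dividing the characteristic equation by $z^d$ gives $1 = \sum_j z^{-d_j}$, whose right-hand side is strictly decreasing for $z > 0$, so bisection finds the unique positive root. We assume here, as is standard for recurrences of this form, that this positive root is the root of maximum absolute value. The name `branching_number` is hypothetical.

```python
def branching_number(vector, tol=1e-12):
    """Positive root of z^d = sum_j z^(d - d_j) for a branching vector.

    vector: sequence of at least two positive integers d_1, ..., d_i.
    """
    def excess(z):
        # Positive below the root, negative above it.
        return sum(z ** -d for d in vector) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:       # enlarge the bracket until it contains the root
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For the simple binary branching vector (1, 1) this gives 2, and for the 24-entry vector quoted above it gives a value of about 1.3999.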

3 The algorithm — overview 

Our algorithm works in basically the same way as the fixed parameter algorithms
for Vertex Cover do [1, 4, 9]. The main part is to build a bounded search tree: To
cover an edge, we have to put at least one of its two endpoints into the (optimal)
vertex cover sets. Thus, starting with an arbitrary edge, we can make a binary
decision between its two endpoints. In each subcase, we delete the chosen
vertex and its incident edges and repeat this until we have built a search
tree of size $2^{k_1+k_2}$. As a consequence, it is easy to see that this leads to an
algorithm running in time $O(2^{k_1+k_2} n)$, where n denotes the number of vertices
in the graph. All results (including ours) to get more efficient algorithms are
based on efforts to shrink the search tree size.
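
As an illustration of the trivial bounded search tree described above, the following Python sketch decides a CBVC instance by branching on the two endpoints of an arbitrary uncovered edge. The input format (a list of edges (u, v) with u in the first and v in the second vertex set) and the name `cbvc_trivial` are assumptions made for this example only. It is the naive search tree of size $2^{k_1+k_2}$, not the refined algorithm developed in this paper.

```python
def cbvc_trivial(edges, k1, k2):
    """Return a cover (C1, C2) with |C1| <= k1 and |C2| <= k2, or None.

    edges: list of pairs (u, v), u from the first vertex set, v from the second.
    """
    if not edges:
        return set(), set()          # nothing left to cover
    u, v = edges[0]                  # an arbitrary uncovered edge

    if k1 > 0:                       # branch 1: put u into the cover
        rest = [(a, b) for (a, b) in edges if a != u]
        sub = cbvc_trivial(rest, k1 - 1, k2)
        if sub is not None:
            return sub[0] | {u}, sub[1]

    if k2 > 0:                       # branch 2: put v into the cover
        rest = [(a, b) for (a, b) in edges if b != v]
        sub = cbvc_trivial(rest, k1, k2 - 1)
        if sub is not None:
            return sub[0], sub[1] | {v}

    return None                      # the edge (u, v) cannot be covered


if __name__ == "__main__":
    path = [(0, "a"), (1, "a"), (1, "b")]
    print(cbvc_trivial(path, 1, 1))
```

Each recursive call decreases $k_1 + k_2$ by one and makes at most two further calls, which is exactly the source of the $2^{k_1+k_2}$ bound.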

As in the classical case, we achieve a reduction of the search tree size by
distinguishing between the degrees of the graph vertices. Since for CBVC we have
to minimize with respect to two parameters, this gets significantly harder than
in the classical Vertex Cover case. For instance, in the classical case, taking
the neighbor of a degree-1-vertex always leads to an optimal vertex cover.
Thus, a branching in the search tree is avoided. However, this is no longer possible
in the CBVC case, because the neighbor belongs to the second vertex set in the
bipartite graph and we have to minimize with respect to two vertex cover set
sizes. In particular, the size of a minimal solution for CBVC is no longer uniquely
determined, because the signature $s = (|C_1|, |C_2|)$ of a vertex cover is a vector
of numbers instead of simply a number. Our algorithm provides, however, for
each minimal s a corresponding minimal solution.

Before we give an overview of our algorithm, we still have to briefly explain a
technique called reduction to problem kernel [3], which is a kind of preprocessing.
This step is based on a simple observation already used by Kuo and Fuchs [6],
which led to the “must-repair-analysis” pre-phase in their algorithms. Let
$G = (V_1, V_2, E)$ be our given bipartite graph and $k_1$ and $k_2$ be the constraints. Clearly,
if a vertex in $V_1$ has a degree greater than $k_2$, then it has to be part of the vertex
cover and, analogously, if a vertex in $V_2$ has a degree greater than $k_1$, then it
also has to be part of the vertex cover. In this way, deleting all the “high-degree
vertices” together with their incident edges, we can infer that after reduction to
problem kernel the size of the graph is at most $2k_1k_2$. Obviously, reduction to
problem kernel can be implemented to run in time $O((k_1 + k_2)n)$. Combining
this reduction to problem kernel with the trivial search tree of size $2^{k_1+k_2}$, we
would end up with an $O(2^{k_1+k_2} k_1 k_2 + (k_1 + k_2)n)$ time algorithm. In the rest of the
paper, we describe how to shrink the search tree size from $2^{k_1+k_2}$ to $1.3999^{k_1+k_2}$.
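
The reduction to problem kernel can be sketched as follows. The code is only an illustration under stated assumptions: edges are pairs (u, v) with u in $V_1$ and v in $V_2$, the name `reduce_to_kernel` is hypothetical, and the rule is re-applied with the decreased budgets until no high-degree vertex remains, which is a natural but assumed extension of the single pass described above. The final test uses the fact that, once all degrees are bounded, at most $k_1$ vertices of degree at most $k_2$ and at most $k_2$ vertices of degree at most $k_1$ can cover no more than $2k_1k_2$ edges.

```python
from collections import Counter


def reduce_to_kernel(edges, k1, k2):
    """Force high-degree vertices into the cover.

    Returns (forced1, forced2, remaining_edges, k1, k2) with the reduced
    budgets, or None if the instance is recognized as not coverable.
    """
    edges = list(edges)
    forced1, forced2 = set(), set()
    while True:
        deg1, deg2 = Counter(), Counter()
        for u, v in edges:
            deg1[u] += 1
            deg2[v] += 1
        # A V1-vertex with more than k2 neighbors must be taken, since
        # otherwise all of its neighbors in V2 would be needed; symmetrically
        # for V2-vertices with more than k1 neighbors.
        high1 = {u for u, d in deg1.items() if d > k2}
        high2 = {v for v, d in deg2.items() if d > k1}
        if not high1 and not high2:
            break
        forced1 |= high1
        forced2 |= high2
        k1 -= len(high1)
        k2 -= len(high2)
        if k1 < 0 or k2 < 0:
            return None
        edges = [(u, v) for (u, v) in edges
                 if u not in high1 and v not in high2]
    if len(edges) > 2 * k1 * k2:
        return None                  # too many edges left for the budgets
    return forced1, forced2, edges, k1, k2


if __name__ == "__main__":
    g = [(0, "a"), (0, "b"), (0, "c"), (1, "a")]
    print(reduce_to_kernel(g, 2, 2))
```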

Before we describe the overall structure of our search tree algorithm, let us
briefly deal with an easy special case. Clearly, we can deal with each connected
component separately. So, let us assume that the graph is connected. If the
maximum vertex degree of a graph is at most two, then CBVC is easy to solve: We
know that, in this case, the graph is either a cycle or a path. In both cases,
however, it is fairly easy to compute the linear number of possible minimal vertex
covers in linear time. We omit the basically straightforward details. In addition,
as previously mentioned, here we have to take into consideration that our given
graph may be split into several connected components. Since the various
components are “independent” of each other, we can simply combine their signatures
using component-wise addition and then again look for the minimal values.
Consequently, by using simple data structures, this can be done in $O((k_1 + k_2)^2)$ time,
because $1 + \min(k_1, k_2)$ is an upper bound for the number of signatures belonging to a
component. Furthermore, we can assume that there are at most $k_1 + k_2$ components,
since otherwise the graph is not coverable, and we know that each output of
merging two minimal vertex covers always is bounded by $O(k_1 + k_2)$. As a result
of this, we have $(k_1 + k_2)$ merge steps each of time complexity $O((k_1 + k_2)^2)$.
Summarized, this gives:

Proposition 1. For bipartite graphs with maximum vertex degree 2, CBVC can
be solved in time $O((k_1 + k_2)^3)$.
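
The merging of component signatures used for Proposition 1 can be illustrated by a short Python sketch. It assumes that each connected component is already described by its list of minimal signatures $(c_1, c_2)$; the helper names `pareto_minimal` and `merge` are hypothetical. Two lists are combined by component-wise addition, sums exceeding the budgets are discarded, and only the component-wise minimal sums are kept.

```python
def pareto_minimal(signatures):
    """Keep the signatures that are not dominated component-wise."""
    kept = []
    smallest_c2 = None
    # After sorting by (c1, c2), a signature is dominated exactly when an
    # earlier one has a second component that is at most as large.
    for c1, c2 in sorted(set(signatures)):
        if smallest_c2 is None or c2 < smallest_c2:
            kept.append((c1, c2))
            smallest_c2 = c2
    return kept


def merge(sigs_a, sigs_b, k1, k2):
    """Combine the minimal signatures of two independent components."""
    sums = [(a1 + b1, a2 + b2)
            for (a1, a2) in sigs_a
            for (b1, b2) in sigs_b
            if a1 + b1 <= k1 and a2 + b2 <= k2]
    return pareto_minimal(sums)


if __name__ == "__main__":
    single_edge = [(1, 0), (0, 1)]   # minimal signatures of one isolated edge
    print(merge(single_edge, single_edge, 2, 2))
```

Since every list contains at most $1 + \min(k_1, k_2)$ signatures, the number of pairwise sums formed in one merge is quadratic in the parameters.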

By Prop. 1, in the following description of the basic structure of our search 
tree algorithm we may focus on graphs with maximum degree of at least three: 



Overall structure of the search tree algorithm. 

In principle, the algorithm recursively finds optimal vertex covers as follows.
Given a bipartite graph G, we choose several subgraphs $G_1, \ldots, G_i$ and compute
optimal vertex covers for all of them. From them we can construct an optimal
vertex cover for G. For example, let X be some vertex of G and let $G_1$ be the
subgraph that results from G by deleting X and all its incident edges. A vertex
cover of $G_1$, together with X, is then a vertex cover of G. Moreover, if there are
optimal vertex covers for G that contain X, then we can construct optimal vertex
covers from an optimal vertex cover of $G_1$. Otherwise, if no optimal vertex cover
of G contains X, they must contain all neighbors of X. Hence, let $G_2$ be the
graph that results from G by deleting all neighbors of X. Again, we can construct
a vertex cover of G by taking vertex covers of $G_2$ and adding all neighbors of X.
If we start from optimal vertex covers for $G_1$ and $G_2$, then at least one of the
resulting covers for G must be optimal, since either X or its neighbors must be
part of any vertex cover.

However, we select the subgraphs $G_1, \ldots, G_i$ in a more complex manner and
branch according to much more complicated sets. The rules for selecting those
branching sets are as follows. W.l.o.g. we assume that the graph is connected. We
distinguish between eight main cases (M1-M8), some of them requiring further
subcases. More details on these appear in Section 4. It is of central importance
for the correctness of our algorithm to execute the various steps in the given
order; that is, we always choose an applicable step with minimal number:

M1. If there is a vertex X with degree at least 4, then branch according to X
and NX. Branching vector and branching number: (1, 4) and 1.3803.

M2. If the graph is 3-regular, then pick any vertex X and branch according
to X and NX. (This step has to be applied at most once and thus does
not influence the algorithm’s asymptotic complexity. Similarly, if G contains
only a small constant number of vertices of degree three, such a branch does
not affect the overall time analysis.)

M3. Deal with tails of size at least two.
Branching vector and branching number: (2, 3) and 1.3248.

M4. Deal with 4-cycles.^
Branching vector and branching number: (2, 2) and 1.4143.

M5. Deal with chains of length at least three.
Branching vector and branching number: (3, 4, 3) and 1.3954.

M6. Deal with degree-3-vertices with three neighbors of degree 2.
Branching vector and branching number: (4, 6, 6, 7, 3) and 1.3954.

M7. Deal with degree-3-vertices with two neighbors of degree 2.
Branching vector and branching number: (6, 7, 4, 2) and 1.3993.

M8. Deal with degree-3-vertices with one neighbor of degree 2.
Branching vector and branching number:
(8, 9, 8, 10, 11, 10, 10, 11, 10, 10, 12, 11, 9, 10, 9, 10, 12, 11, 7, 9, 8, 9, 10, 9) and
1.3999.

The steps above can be easily shown to provide a complete case distinction
capable of handling all possible cases. More specifically, from this point of view,
steps M3-M5 would even be superfluous; they are, however, necessary in order
to get a small search tree size by handling “nice special cases” in advance. The
harder cases shown above are M4, M6, M7, and M8. The worst case branching
vector occurs in M8 and implies a search tree size of $1.3999^{k_1+k_2}$.

In total, we obtain a CBVC algorithm running in time

$$O(1.3999^{k_1+k_2} k_1 k_2 + (k_1 + k_2)n + (k_1 + k_2)^3).$$

This can be improved to $O(1.3999^{k_1+k_2} + (k_1 + k_2)n)$ by simple asymptotic
arguments. However, the factor $k_1 k_2$ above can be omitted not only by “asymptotic
tricks,” but also by an argument based on a refined complexity analysis. The basic
idea is that nearly all nodes in the search tree are near to the leaves and we can
therefore expect that the running time is the size of the search tree times a small
constant instead of times $k_1 k_2$. Observe that near to the leaves the graph to be
processed is small. We omit any details and just note that the argument works
in complete analogy to the one for Vertex Cover in [9].

^ Here, we only give the result obtained from a very simple analysis. A
refined one is deferred to the full paper and yields branching vector
(6, 7, 7, 7, 7, 9, 9, 9, 9, 8, 8, 10, 10, 10, 10, 12) and branching number 1.3996.

4 Details of the algorithm 

Next, we present details of the main algorithm shown above, hence coming to
an assessment of the running time in the worst case. To this end, we introduce
a counter k which will be initialized to $k_1 + k_2$, i.e., the sum of the two input
parameters bounding the size of the vertex cover. Each recursive call of the
main procedure will decrement k in some way, so that k obviously bounds the
depth of the search tree. Most conveniently, k is considered to be a parameter of
the main algorithm, which can be called like m(G, k). Observe that due to the
reduction to the problem kernel, G has $O(k^2)$ vertices and $O(k^2)$ edges. Since
the main procedure works depth-first, the space requirement of the algorithm is
only polynomial. Because of steps M1 and M2 (cf. Section 3), we can assume
w.l.o.g. that the maximum vertex degree in the graph is three and the graph is
not 3-regular.



Case M3: tails. A tail consists of a degree-3-vertex A, followed by a (possibly
empty) sequence of degree-2-vertices, ended by a degree-1-vertex. If A is
neighbored by a vertex of degree one (i.e., we have a micro-tail), we regard A
as if it had degree two in the following analysis. Otherwise, a typical situation is
depicted in Table 1, where dashed edges and grey vertices are optional. Here, we
encounter the main trick in our time analysis for the first time: we have seen that
we can cope very easily with a graph having vertices of degree at most two
(Proposition 1). Therefore, if we take vertex A in the present situation, we
create a non-empty path component starting at the degree-2-vertex B; in order
to cover that component, we need at least one vertex from that component in
the cover. Although we do not know which vertex (black or white) to take into
the cover, we can safely bound the search tree by calling m(G − A, k − 2) and
m(G − NA, k − 3).

Note that in Table 1, the column labeled “Branching” contains the information
necessary to understand the branching analysis. The first subcolumn lists
the conditions under which the analysis is valid; ∅ indicates that no prerequisites
are necessary. Then, the vertices taken in that branch are listed; here, first A and
then (in the second row) its neighbors. The third subcolumn lists how many
vertices will be needed at least for the cover in the final (after the complete search
tree has been constructed) polynomial time analysis of degree-2-vertices. The
fourth subcolumn gives the values which can be subtracted from the parameter
k in the corresponding recursive call of the main procedure. Finally, the fifth
subcolumn gives an upper estimate for the base of the search tree size derived
from this case, implying search tree size $1.3248^k$.

Table 1. Cases M3-M5

Case M4: 4-cycles. Consider the corresponding picture from Table 1. Why is
the given case distinction complete? Assume two neighboring vertices, say A and
B, are contained in the cover. Then, in order to cover the edge between C and D,
either C or D has to be in the cover, too. Therefore, either case AC or case BD
treats a sub-cover of the proposed case, and we do not exclude taking vertices
other than AC or BD, resp., into the cover. We mention without proof (see full
paper) that a much more detailed analysis of 4-cycles allows for a branching
vector (6, 7, 7, 7, 7, 9, 9, 9, 9, 8, 8, 10, 10, 10, 10, 12) and branching number 1.3996.

Case M5: chains of length at least 3. If two (not necessarily different)
degree-3-vertices A and B are connected via a path of length ℓ on which all
vertices (besides A and B) have degree two, we say that A and B are connected
by a chain of length ℓ. Chains of length 3 and 4 are considered in Table 1 (M5a
and M5b). Should A and B be connected by a longer chain, the following branch
analysis will work:



Condition   Vertices taken   Final analysis   Decrease of k   Base
∅           AB               2                3
∅           A, NB            1                4               1.3954
∅           NA               0                3

In that analysis, we even assumed that A = B or A ∈ NB might occur.

So, we can assume from now on that two degree-3-vertices are connected
by a chain of length at most two. Bearing this in mind, we start analyzing the
possible situations in the neighborhood of a degree-3-vertex with at least one
degree-2-neighbor.




Case M6: One degree-3-vertex with three degree-2-neighbors. In Table 2,
the general situation is depicted in the first picture. We can assume
$\delta A = \delta B = \delta C = 3$, since otherwise there is either a tail (M3) or a chain
(M5).

Case M7: One degree-3-vertex with two degree-2-neighbors. Of course,
since micro-tails at degree-3-vertices are like degree-2-vertices, the
degree-3-vertex in question has one degree-3-neighbor called A. We refer to the second
picture in Table 2. As to the case $|NB \cap NC| = 1$, we mention without proof
that a worst case branching vector (6, 7, 4, 2) with branching number 1.3993 is
attainable.




Table 3. Case M8a: $\delta A_1 = 2$



Case M8: One degree-3-vertex with one degree-2-neighbor. Consider
the picture in Table 3. Each of the two neighbors X, Z of the degree-2-vertex Y has
two degree-3-neighbors (A, B resp. C, D), otherwise we are in case M6 or M7.
4-cycle-subgraphs have been treated in case M4, so that we can assume that A,
B, C and D are pairwise different, and that $NA \cap NB = \emptyset$ and $NC \cap ND = \emptyset$.

Now, we can distinguish cases making assumptions on the degree of $A_1$. If
$\delta A_1 = 1$, we can refer to case M7, since X has “almost” two degree-2-neighbors,
due to the fact that there is a micro-tail at A.

Case M8a: $\delta A_1 = 2$. First, assume that $\delta A_1 = 2$. Then, we can assume that
$A_1'$ has degree three, since otherwise A would have two degree-2-neighbors (M7).
Further, we can assume $\delta A_2 = 3$, since otherwise there would be either a tail
or a chain starting at A (M3 or M5). Finally, if $A_1' \in NA_2$ or $B = A_2$ or
$B \in NA_1'$, then we have a 4-cycle-substructure, see M4. Two special cases occur:
If $C = A_2$ or $D = A_2$, then we have a 6-cycle with two degree-2-vertices. If
$C \in NA_1'$ or $D \in NA_1'$, then we have a 6-cycle with one degree-2-vertex. We
defer the proofs to the full paper and wish only to mention that branching
vectors (6, 7, 6, 9, 6, 8, 3) (branching number 1.3930) and (6, 7, 7, 9, 6, 8, 3) (branching
number 1.3829) can be achieved. In the main case treated in the picture, we can
hence assume $\{C, D\} \cap (NA_1' \cup NA_1) = \emptyset$. These assumptions are only in part
repeated in Table 3.

Case M8b: $\delta A_1 = 3$. In fact, through symmetry, we can now assume that all
neighbors of A, B, C, and D have degree three. The corresponding picture
is given in Table 4. Since we can again assume that a 4-cycle is not a
subgraph of our structure, each of the vertex sets $\{A, B, C, D\}$, $\{A_1, A_1', B_1, B_1'\}$,
$\{C_1, C_1', D_1, D_1'\}$ contains four elements. The special case
$\{A_1, A_1', B_1, B_1'\} \cap \{C_1, C_1', D_1, D_1'\} \neq \emptyset$ yields branching vector
(5, 7, 5, 6, 7, 6, 8, 9, 9) (branching number 1.3982) and the special case
$(NA_1 \cup NA_1') \cap (NB_1 \cup NB_1') \neq \emptyset$ yields
the overall worst case branching number 1.3999 corresponding to the branching
vector (8, 9, 8, 10, 11, 10, 10, 11, 10, 10, 12, 11, 9, 10, 9, 10, 12, 11, 7, 9, 8, 9, 10, 9).

Both special cases are deferred to the full paper. These assumptions are
not repeated in Table 4. (Horizontal lines in the branching list should help to
structure that branching; they do not separate different branching lists.)

Table 4. Case M8b: $\delta A_1 = 3$



5 Conclusion 

We have presented the first nontrivial upper bound for CBVC, an algorithm
running in time $O(1.3999^{k_1+k_2} + (k_1 + k_2)n)$. Since it is exponential only in
the (usually small) parameters $k_1$ and $k_2$, it is of high practical interest and
contributes to the list of “efficient” fixed parameter algorithms [7]. A first, still
unsophisticated version of our algorithm has been implemented and delivers
promising results. As to future work, on the one hand, a competitive
implementation of our algorithm still remains to be achieved. On the other hand, there
is a variant of CBVC with three instead of two parameters, motivated by
reconfigurable programmable logic arrays [5]. This problem deserves investigation
similar to that undertaken for CBVC.

Abstract. In this paper, we introduce the problem of computing a minimum
edge ranking spanning tree (MERST); i.e., find a spanning tree
of a given graph G whose edge ranking is minimum. Although the minimum
edge ranking of a given tree can be computed in polynomial time,
we show that problem MERST is NP-hard. Furthermore, we present
an approximation algorithm for MERST, which realizes its worst case
performance ratio $\min\{(\Delta^*-1)\log n/\Delta^*,\ \Delta^*-1\}/(\log(\Delta^*+1)-1)$,
where n is the number of
vertices in G and $\Delta^*$ is the maximum degree of a spanning tree whose
maximum degree is minimum. Although the approximation algorithm
is a combination of two existing algorithms for the restricted spanning
tree problem and for the minimum edge ranking problem of trees, the
analysis is based on novel properties of the edge ranking of trees.



1 Introduction and Preliminaries 

Let G = (V, E) be an undirected graph, which is simple and connected. An edge
ranking of a graph G = (V, E) is a labeling $r: E \to Z^+$, with the property that
every path between two edges with the same label i contains an intermediate
edge with label j > i. By definition, every edge ranking r has exactly one edge
with the largest rank. An edge ranking by integers 1, 2, ..., k is called a k-edge
ranking. A graph G is said to be k-edge rankable if it has a k-edge ranking. An
edge ranking is minimum if the largest rank k in it is the smallest among all
edge rankings of G; such k is called the minimum edge rank of G and is denoted
by rank(G). The minimum edge ranking problem (MER) asks to compute a
minimum edge ranking of a given graph G and is formally defined as follows:

MER (minimum edge ranking problem)

Input: A simple undirected graph G = (V, E) which is connected, and a
non-negative integer k.

Question: Is G k-edge rankable (i.e., does G satisfy rank(G) ≤ k)?

In the affirmative case, such a rank(G)-edge ranking is also required. Fig. 1 (a)
is an input graph G = (V, E) with $E = \{e_1, e_2, \ldots, e_9\}$, while (b) and (c) give
5- and 4-edge rankings of G, respectively. Note that (c) in fact gives a minimum
edge ranking of G, i.e., rank(G) = 4.
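
The defining property of an edge ranking can be tested mechanically. The following Python sketch relies on a reformulation that follows from the definition and is stated here as the working assumption of the example: a labeling is an edge ranking if and only if, for every label i, each connected component of the subgraph formed by the edges with label at most i contains at most one edge with label i. Vertices are assumed to be numbered 0, ..., n − 1, and `is_edge_ranking` is a hypothetical name.

```python
def is_edge_ranking(n, ranked_edges):
    """Check whether (u, v, rank) triples form an edge ranking.

    Vertices are 0..n-1; ranks are positive integers.
    """
    parent = list(range(n))          # union-find over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    by_rank = {}
    for u, v, r in ranked_edges:
        by_rank.setdefault(r, []).append((u, v))

    for r in sorted(by_rank):
        # Add all edges of rank r on top of the edges of smaller rank ...
        for u, v in by_rank[r]:
            parent[find(u)] = find(v)
        # ... and make sure no component holds two edges of rank r.
        seen = set()
        for u, v in by_rank[r]:
            root = find(u)
            if root in seen:
                return False
            seen.add(root)
    return True


if __name__ == "__main__":
    print(is_edge_ranking(4, [(0, 1, 1), (1, 2, 2), (2, 3, 1)]))
    print(is_edge_ranking(3, [(0, 1, 1), (1, 2, 1)]))
```

The first call describes a path whose middle edge carries the larger label and is accepted; the second gives the same label to two adjacent edges and is rejected.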




(a) A graph G (b) An edge ranking of G (c) A minimum edge ranking of G 
Fig. 1. Edge rankings of a graph. 



MER has applications in the context of assembling a multi-part product from 
its components in the smallest number of parallel processing (integration) stages 
[1,7, 12]. It is known that MER is in general NP-hard [9], but it can be solved 
in polynomial time when the graph is a tree [2, 8, 12]. 

In this paper, we consider the following new problem, which resembles MER
but is essentially different. Given a simple undirected graph G = (V, E) which
is connected, we repeat contraction steps until all the vertices are contracted
into a single vertex. Here one contraction step consists of many simultaneous
contractions of edges which do not share any of their end vertices. In this process,
all self-loops created are simply ignored. Under this setting, we would like to minimize
the number of steps required before contracting all vertices into one.

It is easy to see that this problem is equivalent to finding a spanning tree 
of G whose edge ranking is minimum. To see this, first assume that a spanning 
tree of G as well as its edge ranking is given. Then no two edges with the same 
rank share their end vertices, and in the i-th step all edges with rank i can 
be contracted simultaneously. Thus the required number of steps is equal to 
the largest rank in this edge ranking. Conversely, given a series of steps that 
contracts G into a single vertex, we assign rank i to all the edges contracted in 
the i-th step. Then it can be seen that G contains a spanning tree whose edge 
ranking is defined by the above ranks. 
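
The converse direction of this argument can be made concrete. The Python sketch below takes a contraction schedule (a list of steps, each a list of edges of G given by their original end vertices) and returns the contracted edges together with the rank i of the step in which they were contracted. It checks that the edges of one step are pairwise disjoint in the current contracted graph and that the schedule ends with a single vertex; edges that have already become self-loops are ignored, as in the text. The function name and the input format are assumptions for illustration, and membership of the edges in G is not verified.

```python
def schedule_to_ranked_tree(n, steps):
    """Turn a contraction schedule into ranked spanning tree edges.

    n: number of vertices (0..n-1); steps: list of lists of edges (u, v).
    Returns a list of (u, v, rank) triples.
    """
    parent = list(range(n))          # union-find: current super-vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ranked = []
    for i, step in enumerate(steps, start=1):
        touched = set()              # super-vertices used in this step
        chosen = []
        for u, v in step:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue             # a self-loop, simply ignored
            if ru in touched or rv in touched:
                raise ValueError(f"step {i}: edges share an end vertex")
            touched.update((ru, rv))
            chosen.append((u, v, ru, rv))
        # Contract all edges of the step simultaneously.
        for u, v, ru, rv in chosen:
            parent[ru] = rv
            ranked.append((u, v, i))
    if len({find(x) for x in range(n)}) != 1:
        raise ValueError("the schedule does not contract G into one vertex")
    return ranked


if __name__ == "__main__":
    print(schedule_to_ranked_tree(4, [[(0, 1), (2, 3)], [(1, 2)]]))
```

Every contracted edge joins two different super-vertices, so the returned edges contain no cycle, and exactly n − 1 of them are produced when the schedule ends in a single vertex.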

Now we call this problem the minimum edge ranking spanning tree problem 
(MERST). 

MERST (minimum edge ranking spanning tree problem)

Input: A simple undirected graph G = (V, E) which is connected, and a
non-negative integer k.

Question: Does G have a k-edge rankable spanning tree (i.e., does there exist
a spanning tree $T = (V, E_T)$ of G with rank(T) ≤ k)?

Similarly to MER, in the affirmative case, such a spanning tree T as well as
its k-edge ranking is also required. We say that T is a minimum edge ranking
spanning tree of a graph G if T is a spanning tree of G having the minimum
rank(T) among all spanning trees of G. Fig. 2 (a) gives an example of a minimum
edge ranking spanning tree of the graph G in Fig. 1 (a), together with its edge
ranking. Fig. 2 (b) demonstrates the contraction steps defined by T.




Fig. 2. (a) A minimum edge ranking spanning tree T of the graph G in Fig. 1 (a), and
(b) its edge contraction according to the minimum edge ranking spanning tree.



Problem MERST can be found in many practical applications. Among them,
we describe here one example found in relational database theory.

Let us consider the “query graph (join graph)” [6, 11], whose vertex set
corresponds to the set of relations and whose edge set represents the pairs of
relations which are joined. In this context, join operations which are represented
by non-adjacent edges can be performed in parallel, but no two join operations
corresponding to adjacent edges can be performed simultaneously, since a relation
can participate only in one join operation at a time. The join operations are then
performed until all the relations are merged into a single relational table. The
whole process can be formulated as MERST.

In this paper, we show that MERST is NP-hard, and present an approximation
algorithm for MERST, with its worst case performance ratio
$\min\{(\Delta^*-1)\log n/\Delta^*,\ \Delta^*-1\}/(\log(\Delta^*+1)-1)$, where n is the number of vertices in G
and $\Delta^*$ is the maximum degree of a spanning tree whose maximum degree is
minimum. This algorithm is a combination of two existing algorithms for the
minimum degree spanning tree problem (MDST) and for the minimum edge
ranking problem for trees (MERT), respectively.

The rest of the paper is organized as follows. Sect. 2 shows the NP-hardness 
of MERST, and then Sect. 3 presents a simple approximation algorithm for 
MERST. In Sect. 4, we analyze our algorithm by deriving lower and upper 
bounds on the edge ranking of trees. Finally, Sect. 5 summarizes the paper. 

2 NP-hardness of MERST 

In this section, we show that MERST is intractable after showing several lemmas
on the edge ranking. The idea of our proof is based on the NP-hardness proof
of the connected size-k-partition problem for planar bipartite graphs [3]. We
assume that a graph G = (V, E) is simple, undirected and connected. For a
vertex set $W \subseteq V$, G[W] denotes the subgraph of G induced by W.

Lemma 1. Any connected graph G with rank(G) = k has at most $2^k$ vertices.

Let us introduce some additional notions related to minimum edge ranking
spanning trees. For a graph G = (V, E) and a positive integer k, a
size-k-partition (this notion is referred to as k-partition in [3]) of V is a (|V|/k)-tuple
$(V_1, V_2, \ldots, V_{|V|/k})$ such that $V = V_1 \cup V_2 \cup \cdots \cup V_{|V|/k}$, $V_i \cap V_j = \emptyset$ for all $i \neq j$,
and $|V_i| = k$ for $i = 1, 2, \ldots, |V|/k$. Each $V_i$ is called an element of the
partition. A size-k-partition of V is connected if the graphs $G[V_i]$ are connected
for all i. Let G = (V, E) be a graph with $|V| = 2^k$, where $k \geq 0$. We say that G
has a nested partition if it recursively satisfies one of the following conditions:

(i) k = 0, or

(ii) G has a connected size-$2^{k-1}$-partition $(V_1, V_2)$ such that both $G[V_1]$ and
$G[V_2]$ have nested partitions.

Lemma 2. Let G = (V, E) be a graph with $|V| = 2^k$ ($k \geq 0$). Then G has a
k-edge rankable spanning tree if and only if it has a nested partition.

This lemma provides the essential idea of the NP-completeness proof of MERST,
i.e., finding a k-edge rankable spanning tree of G is equivalent to finding a nested
partition of G.

Theorem 1 . MERST is NP-complete. 

Outline of the proof. Given a spanning tree $T = (V, E_T)$, we can check whether
T is k-edge rankable in polynomial time [8]. Hence, MERST belongs to NP.

To prove the completeness, we reduce the following NP-complete problem [5]
to MERST.

3-Dimensional Matching (3DM)

Input: Disjoint sets $X = \{x_1, x_2, \ldots, x_n\}$, $Y = \{y_1, y_2, \ldots, y_n\}$,
$Z = \{z_1, z_2, \ldots, z_n\}$ and a set of triples
$S = \{s_j = (s_{j1}, s_{j2}, s_{j3}) \mid s_{j1} \in X,\ s_{j2} \in Y,\ s_{j3} \in Z,\ j = 1, 2, \ldots, m\}$.

Question: Does S contain a matching M (i.e., |M| = n and every element of
$W = X \cup Y \cup Z$ is covered by exactly one triple of M)?

An instance I of 3DM is depicted in Fig. 3 (a) as a bipartite graph with
vertex sets S and W, where |S| = m and |W| = 3n. It is asked to find a set of
n vertices in S that covers all the vertices in W exactly once. Without loss of
generality, we can assume that $n + m = 2^{\ell}$ holds for some integer $\ell$.

Let us now construct the instance of MERST from an instance of 3DM (i.e.,
X, Y, Z and S). Let $X^* = \{x_i^{(q)} \mid x_i \in X,\ q = 1, 2, 3\}$,
$Y^* = \{y_i^{(q)} \mid y_i \in Y,\ q = 1, 2, 3\}$,
$Z^* = \{z_i^{(q)} \mid z_i \in Z,\ q = 1, 2\}$, and $S^* = \{s_j^{(p)} \mid s_j \in S,\ p = 1, 2, \ldots, 8\}$.
Define an input graph G = (V, E) by

$$V = X^* \cup Y^* \cup Z^* \cup S^*, \qquad E = \bigcup_{h=1}^{6} E_h, \qquad (1)$$

where

$E_1 = \{\ldots \mid x_i \in X\}$, $E_2 = \{\ldots \mid y_i \in Y\}$,

$E_3 = \{\ldots \mid z_i \in Z\}$, $E_4 = \{\ldots \mid s_j \in S,\ p = 1, 2, \ldots, 7\}$,

$E_5 = \{\ldots \mid s_j = (s_{j1}, s_{j2}, s_{j3}) \in S\}$,

$E_6 = \{\ldots \mid s_j, s_{j'} \in S\}$.



Recall that, in the instance of 3DM in Fig. 3 (a), a triple $s_j = (s_{j1}, s_{j2}, s_{j3}) \in S$
forms a claw including three vertices $s_{j1}$, $s_{j2}$ and $s_{j3} \in W$, where $s_{j1}$, $s_{j2}$ and
$s_{j3}$ are respectively represented by a doubled circle, a doubled square and a
doubled triangle. We refer to these three vertices in $X \cup Y \cup Z$ as matching
vertices and the last vertex $s_j \in S$ as a tuple vertex. In our transformation of
(1), we replace each claw $s_j = (s_{j1}, s_{j2}, s_{j3})$ by a comb as shown in Fig. 3 (b).
A comb consists of sixteen vertices, where $s_{j1}^{(q)}$, q = 1, 2, 3, $s_{j2}^{(q)}$, q = 1, 2, 3, and
$s_{j3}^{(q)}$, q = 1, 2, correspond to the matching vertices $s_{j1}$, $s_{j2}$ and $s_{j3}$ in a claw,
respectively, and the remaining eight vertices $s_j^{(p)}$, p = 1, 2, ..., 8, correspond to
the tuple vertex $s_j$ in a claw. Note that the comb corresponding to $s_j$ may share
the vertices $s_{jh}^{(q)}$ with those corresponding to $s_{j'}$ if $s_{jh} = s_{j'h'}$ holds for some
h and h'. All combs are completely connected by edges in $E_6$. Also note that
$|V| = 8m + 8n = 2^3(m + n) = 2^{\ell+3}$ holds by the assumption that $n + m = 2^{\ell}$.
Thus, we can let $k = \ell + 3$. (See Fig. 4 for intuitive comprehension.)




Fig. 3. (a) An instance of 3DM, and (b) transformation from a claw in 3DM to a comb 
in MERST. 



After this transformation we can prove that S has a matching M if and only
if the resulting graph G has a nested partition (i.e., it has a k-edge rankable
spanning tree T). □

3 An Approximation Algorithm for MERST 

Since MERST is NP-hard, we propose in this section an approximation
algorithm, which is a combination of two existing algorithms for the minimum degree
spanning tree problem (MDST, defined below) and for the minimum edge
ranking problem of trees (MERT, which is MER whose input graphs are restricted
to be trees). We also state its approximation ratio here, but detailed analysis of
the algorithm will be given in Sect. 4.

Fig. 4. Transformation from set of triples S of 3DM to the input graph G of MERST.

MDST

Input: A graph G = (V, E).

Output: A minimum degree spanning tree $T = (V, E_T)$ of G; i.e., a spanning
tree T of G whose maximum degree is minimum.

We denote the maximum degree of vertices in a graph G by $\Delta_G$, and the
maximum degree of the minimum degree spanning tree T of G by $\Delta^*$ ($= \Delta_T$).
Although MDST is known to be NP-hard [5], Fürer and Raghavachari [4]
developed a polynomial time approximation algorithm which computes a spanning
tree T satisfying

$$\Delta^* \leq \Delta_T \leq \Delta^* + 1 \ (\leq \Delta_G). \qquad (2)$$

Our approximation algorithm for MERST first computes a spanning tree
$T_{Approx}$ of G satisfying (2) (by using the algorithm in [4]), and then computes
its minimum edge ranking. Recall that MERT is polynomially solvable (e.g., [8]).
Thus, our algorithm described below can be executed in polynomial time.

Algorithm Approx_MERST
Input: A graph G = (V, E).

Output: A spanning tree T of G and its edge ranking r.

Step 1: Compute a spanning tree $T_{Approx}$ of G satisfying (2).

Step 2: Compute a minimum edge ranking r of $T_{Approx}$.

Step 3: Output $T = T_{Approx}$ and its edge ranking r. □




404 Kazuhisa Makino, Yushi Uno, and Toshihide Ibaraki 



Theorem 2. For a graph G = (V, E) with |V| = n, let $T_{Min}$ denote a minimum
edge ranking spanning tree of G, and let $T_{Approx}$ denote a spanning tree of G
computed by algorithm Approx_MERST for the input G. Then, the approximation
ratio of algorithm Approx_MERST can be bounded from above by

$$\frac{rank(T_{Approx})}{rank(T_{Min})} \leq \frac{\min\{(\Delta^*-1)\log n/\Delta^*,\ \Delta^*-1\}}{\log(\Delta^*+1)-1},$$

where $\Delta^*$ is the maximum degree of the minimum degree spanning tree of G. □
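
For concrete values, the bound of Theorem 2 can be evaluated directly. The sketch below assumes logarithms to base 2, consistent with the bound of $2^k$ vertices in Lemma 1, and $\Delta^* \geq 2$, since for $\Delta^* = 1$ the denominator vanishes; the function name `approx_ratio_bound` is hypothetical.

```python
import math


def approx_ratio_bound(n, delta_star):
    """Evaluate the upper bound of Theorem 2 on the approximation ratio.

    n: number of vertices; delta_star: maximum degree of a minimum degree
    spanning tree. Logarithms are taken to base 2 (an assumption).
    """
    if n < 2 or delta_star < 2:
        raise ValueError("requires n >= 2 and delta_star >= 2")
    log_n = math.log2(n)
    numerator = min((delta_star - 1) * log_n / delta_star, delta_star - 1)
    return numerator / (math.log2(delta_star + 1) - 1)


if __name__ == "__main__":
    for d in (2, 3, 7):
        print(d, approx_ratio_bound(1024, d))
```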

4 Analysis of Edge Ranking of Trees 

In this section, we discuss some properties of the minimum edge ranking of trees
in order to prove the approximation ratio of algorithm Approx_MERST. In
particular, we derive upper and lower bounds on rank(T) of a tree $T = (V, E_T)$
in terms of the number of vertices n = |V| and its maximum degree $\Delta_T$.



4.1 Lower Bound on the Edge Rank of Trees 

Lemma 3. For any tree $T = (V, E_T)$, $rank(T) \geq \max\{\Delta_T, \lceil \log n \rceil\}$ holds,
where $\Delta_T$ is the maximum degree of vertices in T and n = |V|.

Both lower bounds $\Delta_T$ and $\lceil \log n \rceil$ are tight, that is, there exist trees $T_1$ and
$T_2$ such that $rank(T_1) = \Delta_{T_1}$ and $rank(T_2) = \lceil \log n \rceil$, respectively.



4.2 Upper Bound on the Edge Rank of Trees 

Lemma 4. Let $T = (V, E_T)$ be a tree with |V| = n. Then it holds that

$$rank(T) = \lceil \log n \rceil \quad \text{if } \Delta_T = 0, 1, 2. \qquad (3)$$

Recall that a tree T is a path when $\Delta_T = 0, 1, 2$. Therefore, equality (3) of
Lemma 4 is clear since we have $rank(T) = \lceil \log n \rceil$ in these cases. The proof of
general inequality (4) is the main goal of this section, and its outline will be
given below. This lemma, together with Lemma 3, proves Theorem 2, since the
algorithm of Fürer and Raghavachari [4] can find a spanning tree T of G such
that $\Delta^* \leq \Delta_T \leq \Delta^* + 1$ in the first step of Approx_MERST.

Edge Ranking of a Tree by Top-down Approach. For the purpose of
proving (4), we present an algorithm which gives a consistent (but not always
minimum) edge ranking of trees.

All the existing exact algorithms for MERT [2, 8, 12] are based on the
bottom-up approach: Choose an arbitrary vertex as a root (which is placed at the top)
and assign ranks from leaf edges to top edges in the resulting rooted tree.
However, we describe here a top-down edge ranking algorithm, which may not give
an exact minimum solution but is easy to analyze.

It starts from the given tree T, and in each step, removes one edge from a
generated subtree of T to split it into two, until all the generated subtrees become
singletons. This process can be visualized by the partition tree as illustrated in
Fig. 5.




Algorithm Edge_Ranking_of_Trees (ERT)

Input: A tree $T = (V, E_T)$.
Output: An edge ranking of T.

Step 1: Start with the partition tree consisting of exactly one component T.

Step 2: If there is a subtree T', which has more than one vertex and is
located in a leaf position of the current partition tree, choose an edge e in T'
according to the criterion to be described later and remove it after giving it
the temporary rank $r'(e) = i$, where i is the depth of the chosen subtree in the
partition tree. The two subtrees resulting from the removal of e become the
children of T' in the partition tree. Return to Step 2. On the other hand, if there
is no subtree satisfying the above condition, then go to Step 3.

Step 3: Let h be the height of the resulting partition tree, and rank all edges
e by $r(e) = h - r'(e)$. □

It is easy to see that this algorithm in fact gives an edge ranking. The main point
of this algorithm is how to determine an edge e to be removed from T' in Step
2. Our criterion selects an edge which makes the resulting two subtrees “most
balanced” in the sense that the difference of their sizes is smallest. Formally,
such an edge is defined as follows.

Definition. For a tree $T = (V, E_T)$, a partition $(V_1, V_2)$ of V is called connected
if the induced subgraphs $T[V_1]$ and $T[V_2]$ are both connected (i.e., subtrees of
T). A connected partition $(V_1, V_2)$ is called optimal if the difference of their sizes
$\big| |V_1| - |V_2| \big|$ is minimum. An edge is called optimal if its removal produces an
optimal connected partition.

In order to characterize optimal edges, we give some more definitions.

Definition. Given a tree $T = (V, E_T)$, the weight $w(v)$ of a vertex $v \in V$ is
the size of the subtree which has the maximum number of vertices among all
subtrees in $T[V \setminus \{v\}]$, where $T[V \setminus \{v\}]$ denotes the subgraph of T induced by
$V \setminus \{v\}$. The centroid of a tree T is the set of all the vertices having the minimum
weight. A vertex in the centroid is called a centroid vertex.

Now, take a vertex $v_0 \in V$ of a tree $T = (V, E_T)$, and consider that T is rooted
at $v_0$. Assume that $v_0$ has $b = \deg(v_0)$ children $v_1, v_2, \ldots, v_b$. If we remove $v_0$
from T, then there remain b subtrees $T_1, T_2, \ldots, T_b$, where each $T_j$ is rooted at
$v_j$. We index these $T_j$ in the nonincreasing order of their sizes (i.e., the number
of vertices). By definition, $|T_1| \geq |T_2| \geq \cdots \geq |T_b|$, and hence $w(v_0) = |T_1|$ holds.

Lemma 5. Let $T = (V, E)$ be a tree with $n = |V|$. Then $v_0$ is a centroid vertex
of T if and only if $w(v_0) \leq n/2$ holds. Furthermore, any tree T has either one
centroid vertex $v_0$ or two centroid vertices $v_0$ and $v_1$ which are adjacent to each
other. In the latter case, $w(v_0) = w(v_1) = n/2$ holds.
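
The weight and centroid definitions can be checked directly on small trees. The following Python sketch is an illustration only and not part of the original paper: the adjacency-dictionary representation and the helper names `vertex_weights` and `centroid` are assumptions made here. It computes $w(v)$ literally from the definition by exploring every component of T with v removed, which takes quadratic time but mirrors the text.

```python
def vertex_weights(adj):
    """Return {v: w(v)}, where w(v) is the size of the largest component of T - v.

    `adj` maps every vertex of a tree to the set of its neighbors
    (a representation assumed for this sketch).
    """
    weights = {}
    for v in adj:
        largest = 0
        seen = {v}  # treat v as removed
        for start in adj[v]:
            # In a tree, each neighbor of v lies in its own component of T - v.
            seen.add(start)
            stack, size = [start], 0
            while stack:
                u = stack.pop()
                size += 1
                for w in adj[u]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            largest = max(largest, size)
        weights[v] = largest
    return weights


def centroid(adj):
    """Return the set of vertices of minimum weight."""
    weights = vertex_weights(adj)
    smallest = min(weights.values())
    return {v for v, w in weights.items() if w == smallest}


# A path on four vertices: two adjacent centroid vertices of weight n/2 = 2.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
# A star: the center is the unique centroid vertex.
star = {"c": {1, 2, 3}, 1: {"c"}, 2: {"c"}, 3: {"c"}}
```

On the path, `centroid(path)` consists of the two middle vertices, which matches the second case of Lemma 5; on the star it is the center alone.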

In the subsequent discussion, we choose $v_0$ and $v_1$ so that $v_0$ is the centroid
vertex if there is only one centroid vertex (in this case, $v_1$ is the root of subtree
$T_1$), and $v_0$ and $v_1$ are both centroid vertices if there are two centroid vertices.

Lemma 6. For a tree T, let $v_0$ and $v_1$ be defined as above. Then $(v_0, v_1)$ is an
optimal edge of T.

Now we present the following criterion. 

Criterion in Step 2 of ERT: Choose the optimal edge $(v_0, v_1)$ in the subtree
under consideration.
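
Algorithm ERT together with this criterion can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: the function names `optimal_edge` and `edge_ranking`, the adjacency-dictionary input, and the use of an explicit stack in place of the partition tree are all assumptions of this sketch. The optimal edge is taken as the edge from a centroid vertex to the root of its largest branch, as in Lemma 6.

```python
def optimal_edge(adj, start):
    """Return an optimal edge (v0, v1) of the subtree containing `start`,
    or None if that subtree is a single vertex."""
    parent = {start: None}
    order = [start]
    for v in order:  # the list grows while we scan it (breadth-first order)
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
    n = len(order)
    if n == 1:
        return None
    size = dict.fromkeys(order, 1)
    for v in reversed(order[1:]):  # children are finished before their parents
        size[parent[v]] += size[v]

    def branches(v):
        """Sizes of the components of the subtree minus v, keyed by neighbor."""
        out = {w: size[w] for w in adj[v] if w != parent[v]}
        if v != start:
            out[parent[v]] = n - size[v]
        return out

    v0 = min(order, key=lambda v: max(branches(v).values()))  # a centroid vertex
    sizes = branches(v0)
    v1 = max(sizes, key=sizes.get)  # root of the largest branch at v0
    return v0, v1


def edge_ranking(tree):
    """Top-down edge ranking in the spirit of Algorithm ERT."""
    adj = {v: set(ns) for v, ns in tree.items()}  # private copy; edges get removed
    temp = {}                       # edge -> temporary rank r'(e) = depth
    height = 0
    stack = [(next(iter(adj)), 0)]  # (some vertex of a subtree, its depth)
    while stack:
        start, depth = stack.pop()
        height = max(height, depth)
        edge = optimal_edge(adj, start)
        if edge is None:
            continue                # singleton: a leaf of the partition tree
        v0, v1 = edge
        temp[frozenset(edge)] = depth
        adj[v0].discard(v1)
        adj[v1].discard(v0)
        stack.append((v0, depth + 1))
        stack.append((v1, depth + 1))
    return {e: height - d for e, d in temp.items()}  # Step 3: r(e) = h - r'(e)


path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 8} for i in range(8)}
ranks = edge_ranking(path)
print(max(ranks.values()))  # 3 for this 8-vertex path, i.e. ceil(log2(8))
```

For a star with four leaves the same function uses the ranks 1 to 4, one per edge, which agrees with the lower bound $\Delta_T$ of Lemma 3.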



Analysis of Algorithm ERT. Let $\gamma = \frac{(\Delta_T - 2) \log n}{\log \Delta_T - 1}$. In this subsection, we
prove the inequality (4) of Lemma 4; i.e., Algorithm ERT always gives a $\gamma$-edge
ranking to a tree $T = (V, E_T)$, where $n = |V|$ and we assume $\Delta_T \geq 3$. Note that
$\gamma$ is non-decreasing in $\Delta_T$ as can be proved by direct calculation.

The proof is done by induction on n. 

In case of n = 1, Lemma 4 clearly holds. Assuming that Lemma 4 holds for all
trees T' with n' (< n) vertices and maximum degree $\Delta_{T'}$ ($\leq \Delta_T$), we consider
the case of n vertices.

For an optimal partition $(V_1, V_2)$ of a tree $T = (V, E_T)$, the subtree which
includes the centroid vertex $v_0$ of T is called the main subtree. By the definition
of $v_0$, the main subtree is always greater than or equal to the other subtree.

Now, given a tree $T = T^{(0,0)} = (V_{(0,0)}, E_{(0,0)})$, we recursively define its vertices
$v_0^{(i)}$, $v_1^{(i)}$ and subtrees $T^{(i,0)} = (V_{(i,0)}, E_{(i,0)})$ and $T^{(i,1)} = (V_{(i,1)}, E_{(i,1)})$
for $i = 1, 2, \ldots$, as follows: (i) $(v_0^{(i)}, v_1^{(i)})$ is an optimal edge in $T^{(i-1,0)}$ such that
$v_0^{(i)}$ is a centroid vertex of $T^{(i-1,0)}$; (ii) $T^{(i,0)} = T[V_{(i,0)}]$ and $T^{(i,1)} = T[V_{(i,1)}]$, where
$(V_{(i,0)}, V_{(i,1)})$ is the optimal partition of $V_{(i-1,0)}$ obtained by the deletion of edge
$(v_0^{(i)}, v_1^{(i)})$ such that $V_{(i,0)}$ is its main subtree, i.e., $v_0^{(i)} \in V_{(i,0)}$.

Note that $V_{(i-1,0)} = V_{(i,0)} \cup V_{(i,1)}$ and $E_{(i-1,0)} = E_{(i,0)} \cup E_{(i,1)} \cup \{(v_0^{(i)}, v_1^{(i)})\}$
hold for all i ($\geq 1$). The above condition (ii) implies that

$|T^{(i,0)}| \geq |T^{(i,1)}|$ (5)

for all i. Of course, this partition process of T constitutes a part of the
partition tree of ERT, as shown in Fig. 5.

In the above partition process, let $\alpha_1$ be the positive integer such that
$v_0^{(1)} = v_0^{(2)} = \cdots = v_0^{(\alpha_1)}$ and $v_0^{(\alpha_1)} \neq v_0^{(\alpha_1 + 1)}$, if such $\alpha_1$ exists, and $\alpha_2$ be the
positive integer such that $2|T^{(i,1)}| < |T^{(i,0)}|$ for all $i = 1, 2, \ldots, \alpha_2 - 1$ and
$2|T^{(\alpha_2,1)}| \geq |T^{(\alpha_2,0)}|$. Notice that $\alpha_1 \geq 1$ and $\alpha_2 \geq 1$ hold by definition. Now
define

$\alpha = \min\{\alpha_1, \alpha_2\}$.

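Let $v_0 = v_0^{(1)}$. Since $v_0 = v_0^{(i)}$ holds for $i = 1, 2, \ldots, \alpha$, we have $\alpha \leq \deg(v_0)$.
Let us consider the case of $\alpha = \deg(v_0)$; i.e., $T^{(\alpha,0)}$ consists of a single vertex
$v_0$. In this case, since $v_0 = v_0^{(\alpha)}$ is a centroid vertex of $T^{(\alpha-1,0)}$,
$|T^{(\alpha,1)}| \leq |T^{(\alpha-1,0)}|/2$ holds by Lemma 5. By $|T^{(\alpha-1,0)}| = |T^{(\alpha,0)}| + |T^{(\alpha,1)}|$ and
$|T^{(\alpha,0)}| = 1$, we have $|T^{(\alpha-1,0)}| = 2$. However, since $|T^{(\alpha-1,1)}| \geq 1$, it holds that
$2|T^{(\alpha-1,1)}| \geq |T^{(\alpha-1,0)}|$ (= 2), contradicting the definition of $\alpha$. Hence, we have

$\alpha \leq \deg(v_0) - 1 \ (\leq \Delta_T - 1)$. (6)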

Claim: Algorithm ERT can give at most a $(\gamma - i)$-edge ranking for $T^{(i,1)}$ ($= T_i$),
$i = 1, 2, \ldots, \alpha$, and a $(\gamma - \alpha)$-edge ranking for $T^{(\alpha,0)}$.

Note that this will complete our induction, since Algorithm ERT then gives
the ranks, which are at most $\gamma + 1 - i$, to the remaining edges $(v_0^{(i)}, v_1^{(i)})$,
$i = 1, 2, \ldots, \alpha$. Before showing our claim, we require the lemma which estimates the
sizes of $T^{(i,1)}$ and $T^{(\alpha,0)}$.

Lemma 7. Let $v_0^{(i)}$, $i = 1, 2, \ldots, \alpha$, and $T^{(i,j)}$ be defined as above. Then

$|T^{(i,1)}| \leq |T|/(i + 2)$, $i = 1, 2, \ldots, \alpha - 1$ (7)

$|T^{(\alpha,1)}| \leq |T|/(\alpha + 1)$ (8)

$|T^{(\alpha,0)}| \leq 2|T|/(\alpha + 2)$ (9)

hold. Furthermore, if $\alpha = \deg(v_0) - 1$, we have

$|T^{(\alpha,0)}| \leq (|T| + \alpha)/(\alpha + 1)$. (10)

Now we are ready to prove the above claim.



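First, let us consider the edge ranking of $T^{(i,1)}$, $i = 1, 2, \ldots, \alpha$. By Lemma 7,
$|T^{(i,1)}| \leq |T|/(i + 1)$ holds for all $i = 1, 2, \ldots, \alpha$. By the induction hypothesis, Claim
is proved by showing

$\gamma - i \geq \frac{(\Delta_{T^{(i,1)}} - 2) \log (n/(i+1))}{\log \Delta_{T^{(i,1)}} - 1}$.

Since the right-hand side is at most $\frac{(\Delta_T - 2) \log (n/(i+1))}{\log \Delta_T - 1}$
by the monotonicity of $\gamma$ in $\Delta_T$, it suffices to show

$\gamma - i \geq \frac{(\Delta_T - 2) \log (n/(i+1))}{\log \Delta_T - 1}$,

which is equivalent to $f_1(i) = (\Delta_T - 2) \log(i + 1) - i (\log \Delta_T - 1) \geq 0$. Since
$1 \leq i \leq \Delta_T - 1$ holds by (6), and its derivative $f_1'(i) = (\Delta_T - 2) \log e/(i + 1) - (\log \Delta_T - 1)$
is monotone decreasing in this range, it is sufficient to show that
$f_1(1) \geq 0$ and $f_1(\Delta_T - 1) \geq 0$. Actually, $f_1(1) = f_1(\Delta_T - 1) = \Delta_T - \log \Delta_T - 1 \geq 2 - \log 3 > 0$,
and thus Algorithm ERT gives a $(\gamma - i)$-edge ranking for $T^{(i,1)}$,
$i = 1, 2, \ldots, \alpha$.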

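Let us next consider the edge ranking of $T^{(\alpha,0)}$. Since $\alpha \leq \Delta_T - 1$ by (6), we
divide it into two cases: (a) $\alpha \leq \Delta_T - 2$ and (b) $\alpha = \Delta_T - 1$.

Case (a): we have $|T^{(\alpha,0)}| \leq 2|T|/(\alpha + 2)$ by (9). By the induction hypothesis, we
shall show that $\gamma - \alpha$ is at least the bound of inequality (4) for a tree with
$2n/(\alpha + 2)$ vertices and the maximum degree of $T^{(\alpha,0)}$. By the monotonicity of $\gamma$,
it suffices to show this with $\Delta_T$ in place of that maximum degree.

Case (b): $|T^{(\alpha,0)}| \geq 2$ holds by (6). If $|T^{(\alpha,0)}| = 2$ or 3, then Algorithm ERT
gives a 1- or 2-edge ranking of $T^{(\alpha,0)}$, respectively. If
$|T^{(\alpha,0)}| \geq 4$, it follows from (10) that $|T^{(\alpha,0)}| \leq (|T| + \alpha)/(\alpha + 1)$ holds, and by
the induction hypothesis, we shall show that $\gamma - \alpha$ is at least the bound of
inequality (4) for a tree of this size. In
every case, we can prove Claim by a direct case analysis in a manner similar to
the case of $T^{(i,1)}$.

Consequently, the proof of the claim is completed.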



4.3 Tight Examples of Algorithm ERT 

Let T(d,h) denote a tree in which all the inner vertices have the same degree d
and there exists a vertex $v_0$ such that the distances between $v_0$ and all the leaves
are exactly h. This T(d,h) attains the upper bound of Lemma 4.

Lemma 8. Let d and h be integers such that d > 3 and h > 2. Then, T(d,h) 

satisfies rank(T(d,h)) > 

5 Conclusion 

In this paper, we introduced the minimum edge ranking spanning tree problem
(MERST), and proved that MERST is NP-hard, but it has a simple approximation
algorithm, whose approximation ratio is
$\min\{(\Delta^* - 1) \log n/\Delta^*, \Delta^* - 1\}/(\log(\Delta^* + 1) - 1)$,
where n is the number of vertices in a given graph and $\Delta^*$
is the maximum degree of a spanning tree whose maximum degree is minimum.

Some issues remain for future work. One issue is finding special classes of
graphs, in which MERST is polynomially solvable. Indeed, we could show that
MERST is polynomially solvable for the class of threshold graphs [10].

Another issue is to consider the minimum vertex ranking spanning tree problem
(MVRST), which is the vertex version of our problem. Although this problem
seems to be as hard as MERST, its complexity is still open.



Abstract. The “baseball elimination problem” is a classic problem which
has already been considered from the computational point of view in the
1960’s. At some stage of the baseball season, there is a set of games
which have already been played and there is another set of remaining
games. The problem consists in determining for a given team whether
or not they are already “eliminated”, i.e., whether they can no longer
become champions. Early solutions proposed a network flow approach
which resulted in polynomial time algorithms. The interest in this kind
of elimination problem was recently revived by Wayne [4] who proved
an interesting threshold property which allows one to compute all eliminated
teams simultaneously. Namely, there is a constant $W^*$ such that
a team is eliminated if and only if it can no longer obtain $W^*$ or more
points. Wayne also describes an algorithm for computing the threshold
$W^*$ in polynomial time. Gusfield and Martel [2] have generalized the
proof of the existence of a threshold to a more general setting which
includes European football, where the “3-point-rule” is in effect, i.e., 3
points are awarded for a win and 1 point is awarded for a tie.

In this paper, we show that determining the elimination of a European
football team under the 3-point-rule is NP-complete. As a consequence,
the generalized threshold result of Gusfield and Martel is of no use for the
European football system since computing the corresponding threshold
value is hard if P $\neq$ NP. We also show that the elimination problem is
still NP-complete if all teams have at most three remaining games each
while the problem can be solved in polynomial time if each team has at
most two remaining games.



1 Introduction 

A football/basketball/handball season consists of a set $\{1, \ldots, N\}$ of teams
which have to play against each other. For every game between teams i and
j, points are awarded according to some rule. Let us denote the “$(\alpha, \beta)$-rule” to
mean the following: If a game is won by one team, the winner gets $\alpha$ points and
the losing team gets 0 points. If a game is a tie, then both teams get $\beta$ points
each. European football leagues are currently played under the (3, 1)-rule, which
we also refer to as the “3-point-rule”.



M. Kutylowski, L. Pacholski, T. Wierzbicki (Eds.): MFCS’99, LNCS 1672, pp. 410-418, 1999.
(c) Springer-Verlag Berlin Heidelberg 1999

In (American) baseball, ties are not possible, i.e., every game has a winner
which is awarded one point. We will refer to this system as the “1-point-rule”.

At some stage of the season, each team has played a number of games and
there is a number of remaining games. A team i is called “eliminated” if for
all possible outcomes of the remaining games, there is at least one team which
has more points than team i. Thus, eliminated teams are the ones which can
no longer become champions. Correspondingly, we say that a team i “can still
become champions” if there are outcomes for the remaining games such that no
team has more points than team i.

The “baseball elimination problem” (which was already studied to some extent
in the 1960’s) is the problem of determining whether a given team i is eliminated
under the 1-point-rule. Correspondingly, we can ask for the 3-point-rule
whether a given team is already eliminated. We will refer to this problem as the
“European Football Elimination Problem” (or EFEP for short).¹

The motivation for considering these two problems is of course that they
come from a “real-life application” and that they are also interesting from a
combinatorial point of view.

The baseball elimination problem can be solved in polynomial time by network
flow algorithms (see e.g. [1] and [4] for a more complete survey on the
literature). Variations on the baseball elimination problem have also been
considered, e.g., it was recently shown by McCormick [3] that it is NP-complete,
given t, to determine whether a given team can still achieve a position among
the top t teams.

On the other hand, it is surprising that no corresponding results for the
European Football Elimination Problem have been found so far. The results
in this paper close this gap. The main result is that the European Football
Elimination Problem under the 3-point-rule is an NP-complete problem. It
also follows from our proof that the restricted version of EFEP, where every
team has at most three remaining games, is NP-complete as well. On the other
hand, we show how EFEP can be solved in polynomial time if every team has at
most two remaining games. We only remark that our results can be generalized
to other $(\alpha, \beta)$-rules.

The interest in sports elimination problems was recently revived by Wayne
[4] who investigated the complexity of determining the set of all eliminated teams
simultaneously (under the 1-point-rule). The naive method computes this set by
determining for every team whether or not they are eliminated. Wayne showed
that there is a surprising property which facilitates the computation of the set of
all eliminated teams. Namely, he showed that given a list of played and remaining
games, there is a constant $W^*$ such that the following holds: For all teams i
(which have $w_i$ points and $g_i$ remaining games), it holds that team i is eliminated
if and only if $w_i + g_i < W^*$. Wayne also showed how this threshold value $W^*$
can be determined by a single preflow-push maximum flow computation on a
network. Gusfield and Martel [2] have extended the existence result to the
3-point-rule, i.e., they have shown that there exists a corresponding constant $W^*$.
It follows from our results that knowing about the existence of such a threshold
value $W^*$ is useless for EFEP, as $W^*$ is hard to compute, unless P = NP.

¹ We define EFEP in such a way that its output is “yes” if a team can still become
champions and “no” otherwise.

Finally, let us notice the following: a few years ago, the European football
leagues were played under the (2, 1)-rule. The solution to EFEP under the
(2, 1)-rule was simple: Since the number of points awarded to teams in one game was
a constant (namely, 2), independent of the outcome of the game, it could be
regarded as a special instance of the baseball elimination problem. Namely, one
game between teams i and j under the (2, 1)-rule could be seen as two games
between teams i and j under the 1-point-rule (without ties). Hence, EFEP
under the (2, 1)-rule can be solved by network flow algorithms and it is also
amenable to the new result of Wayne. Our paper has the consequence that for
the newly introduced (3, 1)-rule, it is much harder to decide whether a team can
still become champions.



2 Preliminaries 

Assume that an instance of the European Football Elimination Problem consists
of N teams. The input contains a list $p_1, \ldots, p_N$, where $p_i$ is an integer
representing the points that team i have been awarded in the games they have
played so far. The input also contains a list of games that still have to be played.
W.l.o.g., we can assume that the European Football Elimination Problem asks
whether team number N can still become champions. Furthermore, we assume
that team N has no remaining game to play. The reason why we can do so is
that we can assume w.l.o.g. that team N wins all of its remaining games, should
there be any.

We now represent an input instance to EFEP by an undirected multigraph
which contains N labeled vertices 1, ..., N. Vertex i stands for team i and it is
labeled with the number $p_i - p_N$. By this choice, a negative label on vertex i
means that team i has fewer points than team N and a positive label means that
team i has more points than team N. The edges are constructed as follows: An
edge between vertex i and vertex j stands for one game between the teams i and
j that still has to be played. If there is more than one remaining game between
teams i and j, there is a corresponding number of edges between vertices i and j.
Games that are already played have no corresponding edge in the graph. (Thus,
vertex N is an isolated vertex which has label 0.)

In order to make our proofs technically easier, we will make the following
slight modification which we refer to as “modification (*)”. We modify the rule
of how to award points as follows: For a game between teams i and j that is
not yet played, both teams get one point. This modification does not have any
influence on the problem since at the end of a season, when all games are played,
the points awarded to the teams are not changed under this new rule. We will
assume that modification (*) is already taken into account in the input to the
European Football Elimination Problem.
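
As a small illustration of these conventions, the following Python sketch derives the vertex labels from a table of points and a list of remaining games. It is not taken from the paper: the function name `efep_labels`, the dictionary input and the sample data are assumptions made here. The target team is first given a win in each of its own remaining games, and every other unplayed game is then credited with one point per team, as in modification (*).

```python
def efep_labels(points, remaining, target):
    """Return (labels, games) for the labeled multigraph of an EFEP instance.

    points    -- dict: team -> points awarded so far
    remaining -- list of (team, team) pairs, one per game still to be played
    target    -- the team called N in the text
    """
    pts = dict(points)
    games = []
    for i, j in remaining:
        if target in (i, j):
            pts[target] += 3   # w.l.o.g. the target team wins its remaining games
        else:
            pts[i] += 1        # modification (*): one point each in advance
            pts[j] += 1
            games.append((i, j))
    labels = {team: p - pts[target] for team, p in pts.items()}
    return labels, games


# Made-up sample data: three teams, two games left.
points = {"A": 10, "B": 8, "N": 7}
remaining = [("A", "B"), ("B", "N")]
labels, games = efep_labels(points, remaining, target="N")
```

In this sample, team N ends with 10 points, so team A (10 points plus one advance point) carries the label +1 and team B carries the label −1; the single remaining edge joins A and B.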
3 The Reduction

In this section, we prove the following result:

Theorem 1. The European Football Elimination Problem is NP-complete under
the (3, 1)-rule.

It is clear that EFEP is in NP, since we can guess the outcomes of the remaining
games and compute the final ranking. For showing the NP-completeness, we
reduce the well-known NP-complete problem 3-SAT to EFEP. The reduction
is easily seen to be polynomial-time.

We proceed as follows: for each input formula to 3-SAT, we construct a
labeled multigraph which corresponds to an input to EFEP. First, we show that
the 3-SAT-formula is satisfiable if and only if team number N can still become
champions. We then show that the situation described by the labeled multigraph
can in fact arise during a football season.

An input to 3-SAT is given by a formula in conjunctive normal form where
each clause contains exactly three literals on different variables. We assume that
the formula contains m clauses on the variables $x_1, \ldots, x_n$. By possibly copying
clauses, we can assume that $m \geq 2$ is a power of two. The problem 3-SAT
asks whether the input formula has a satisfying assignment from $\{0, 1\}^n$ to the
variables $x_1, \ldots, x_n$. It is known to be an NP-complete problem.

As an example, the formula (xi V ^ V A (xs VX 2 V xi) might be an input 
to 3-SAT which has a satisfying assignment (1, 1,0,0). 

Given an input formula for 3-SAT, we construct an input to the European
Football Elimination Problem by describing the labeled multigraph. In the
construction of the multigraph, we use the following components:

For every variable $x_j$, we have a full binary tree (called “$x_j$-tree”) which
looks as follows. (The number m of clauses determines the depth of the tree.)

[Figure: the $x_j$-tree, a full binary tree with root label +1, with m leaves for $x_j$ in the left subtree and m leaves for $\bar{x}_j$ in the right subtree.]

The $x_j$-tree has exactly 2m leaves. The leaves in the left subtree of the root
are called “$x_j$-leaves”, while the others are referred to as the “$\bar{x}_j$-leaves”. The
root is labeled with +1, the leaves are labeled with −2, the other (inner) vertices
are labeled with 0.

Let us appeal to the intuition why this tree is of use in our reduction. Call 
the vertex at the top of that tree A and its two children B and C. 

If team N are to become champions, then we have to achieve that no team
has more points than team N, i.e., no vertex should have a label larger than 0 at
the end of the season. In particular, we have to make sure that vertex A no longer
has a label larger than 0. How can we achieve this? Team A has two remaining
games, namely, against its two children B and C. According to modification (*),
team A did already get one point for each of those games. Thus, we have to
make sure that team A loses at least one of its two remaining games. Assume
that A loses its game against team B. This means that the label of team B is
increased to 2, since team B now gets three points for the game against A while
previously, they only got one point for this game.

If team N are to become champions, we have to make sure now that the 
label of team B is not larger than 0 at the end of the season. With arguments 
similar to the ones we just gave, it follows that team B has to lose both its 
games against its two children. The argument can be repeated, i.e., we have an 
avalanche effect, where all the inner vertices have to lose their games against 
their children. This avalanche can obviously only be stopped in the leaves of the 
tree, when the labels of the leaves are increased from —2 to 0. 

For the intuition, we remark that the decision whether team A in the $x_j$-tree
loses against team B or against team C corresponds to the decision whether to
assign $x_j = 0$ or $x_j = 1$.

Let us now continue to describe the reduction. For every clause number i,
we introduce an extra vertex $C_i$ which is labeled with +1 and which we connect
with three vertices. Namely, if this i-th clause contains the literal $x_j$, then we
connect $C_i$ with the i-th $x_j$-leaf. If the i-th clause contains the literal $\bar{x}_j$, then
we connect $C_i$ with the i-th $\bar{x}_j$-leaf. Finally, we add an isolated vertex with
label 0, for “team N”. The reduction can be visualized as follows:

[Figure: the clause vertices, each labeled +1, joined to leaves of the variable trees.]

We first show that for the above input instance to EFEP, the following holds:
Team N can become champions if and only if the 3-SAT-formula is satisfiable.
We then have to show that the input instance can in fact arise during a season,
i.e., we have to show that the configuration of points and remaining games can in
fact arise by appropriate outcomes for the games that have already been played.

Lemma 1. Team N can become champions in the constructed EFEP-instance
if and only if the input 3-SAT-formula is satisfiable.

Proof. Assume that the input 3-SAT-formula is satisfiable and let
$(a_1, \ldots, a_n) \in \{0, 1\}^n$ be a satisfying assignment. We choose the following outcomes for the
remaining games:

If $a_j = 0$, then we declare all edges (=games) in the left subtree of the $x_j$-tree,
including the edge from the root to its left child, to be won by the child. We
declare all edges in the right subtree to be ties. This
has the following effect on the labels: The labels in the right subtree do not
change, and the label of the root and all vertices in the left subtree is 0.

If $a_j = 1$, then we proceed in the same way, with the roles of the left and right
subtrees interchanged.

The only remaining games are between the clause vertices Ci and some leaves. 

Since the assignment $(a_1, \ldots, a_n)$ is a satisfying assignment, we know that
for every clause, there is one literal in the clause which is assigned a 1. Consider
clause number i and assume that it was satisfied by the assignment $x_j := 1$.
We declare the outcomes of the games of vertex $C_i$ as follows: $C_i$ loses its game
against the $x_j$-leaf and its other two remaining games are ties.

The label of $C_i$ changes to 0, and since $x_j$ was assigned a 1, the label of the
i-th $x_j$-leaf was −2 and is now 0, while the other two leaves that $C_i$ is incident to
have their labels unchanged. Altogether, we obtain that a satisfying assignment
to the 3-SAT-formula can be turned into outcomes for the remaining games
such that all teams/vertices have a label of 0 or smaller, i.e., team N can still
become champions.

We now show the other direction, i.e., given outcomes of the remaining games 
such that no team has more points than team N, we construct a satisfying 
assignment for the 3-SAT-formula. 

We first examine the root of an $x_j$-tree. By the remarks before Lemma 1, the
root loses at least one of its two remaining games. Assume that the root loses
against the left child. Due to the avalanche effect, all inner vertices of the left
subtree lose their games against their children.

We now define an assignment as follows: If the root of the $x_j$-tree loses
against its left child, then let $x_j = 0$. Otherwise, let $x_j = 1$. This yields a
satisfying assignment for the 3-SAT-formula, as we show now.

If it was not a satisfying assignment, then there would be at least one unsatisfied
clause. Assume that clause i is not satisfied. If clause i contains a literal
$x_k$, then the corresponding edge is connected with an $x_k$-leaf. Since clause i is not
satisfied, variable $x_k$ was set to 0 and thus the label of the $x_k$-leaf must be zero
(due to the avalanche effect). The case of a literal $\bar{x}_k$ is symmetric. Thus,
vertex $C_i$ has three outgoing edges which all enter leaves which have label 0.

Thus, it is not possible that $C_i$ loses any of its three remaining games, because
otherwise one of those leaves would have a label which is larger than 0. This
means that vertex $C_i$ has a label of at least +1, and team N cannot become
champions, which is a contradiction. □
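
The construction and the first direction of Lemma 1 can be replayed on a small formula. The Python sketch below is only an illustration under assumptions made here: clauses are given as triples of nonzero integers (j for $x_j$, −j for its negation), the vertex names and the functions `build_instance` and `play` are hypothetical, and the sample formula is made up. Every game is stored as a pair (upper vertex, lower vertex), and a win for the lower vertex changes the two labels by −1 and +2, as implied by modification (*).

```python
def build_instance(clauses, n):
    """Labeled multigraph for a 3-SAT formula with variables 1..n.

    The number m of clauses is expected to be a power of two, m >= 2.
    """
    m = len(clauses)
    labels, games, leaf = {"N": 0}, [], {}
    for j in range(1, n + 1):
        root = ("x", j, "root")
        labels[root] = 1
        for positive in (True, False):      # left subtree: x_j, right: negated x_j
            # Heap numbering: vertex k has children 2k, 2k+1; leaves are k = m..2m-1.
            for k in range(1, 2 * m):
                node = ("x", j, positive, k)
                labels[node] = -2 if k >= m else 0
                parent = root if k == 1 else ("x", j, positive, k // 2)
                games.append((parent, node))
                if k >= m:
                    leaf[(j, positive, k - m)] = node
    for i, clause in enumerate(clauses):
        labels[("C", i)] = 1
        for lit in clause:
            games.append((("C", i), leaf[(abs(lit), lit > 0, i)]))
    return labels, games


def play(assignment, labels, games):
    """Final labels when the outcomes are derived from `assignment` as in the proof."""
    final = dict(labels)
    lost = set()                            # clause vertices that already lost once
    for upper, lower in games:
        _, j, positive, _ = lower
        if upper[0] == "x":                 # tree game: avalanche on the false side
            lower_wins = positive == (assignment[j] == 0)
        else:                               # clause game: lose once to a true literal
            lower_wins = positive == (assignment[j] == 1) and upper not in lost
            if lower_wins:
                lost.add(upper)
        if lower_wins:                      # a win replaces the advance point
            final[upper] -= 1
            final[lower] += 2
    return final


clauses = [(1, -2, 3), (-1, 2, 4)]          # a made-up formula with m = 2 clauses
labels, games = build_instance(clauses, n=4)
good = play({1: 1, 2: 1, 3: 0, 4: 0}, labels, games)   # satisfies both clauses
bad = play({1: 0, 2: 1, 3: 0, 4: 0}, labels, games)    # leaves clause 0 unsatisfied
```

With the satisfying assignment no label in `good` exceeds 0, whereas in `bad` the vertex of the unsatisfied clause keeps its label +1.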

Constructing the set of games already played 

We now show that the above configuration can in fact occur during a season.
I.e., we find a list of already played games and outcomes for them such that the
situation of the season is represented by the multigraph described in the reduction.
Let N be the number of vertices resulting from the described reduction.
We will embed the problem into a season with a larger number of teams, where
the additional dummy teams just serve the purpose of allowing us to adjust the
labels of the teams we are interested in.

Let $V = \{v_1, \ldots, v_N\}$ be the set of the N teams which occurred in the reduction
previously described. We introduce two new sets of teams $V' = \{v'_1, \ldots, v'_N\}$
and $V'' = \{v''_1, \ldots, v''_{2N}\}$.

At the beginning of a season, we assume that every team from the set
$V \cup V' \cup V''$ has to play against each other team from the set $V \cup V' \cup V''$ exactly
$t \geq 1$ times. If $t > 1$, then we can, for each pair of teams i and j, declare t−1
games between teams i and j to be a tie. Then all teams have an equal number of
points, and for every pair of teams i and j, there is exactly one game remaining
between the two. Thus, in the following, we can restrict ourselves to seasons with
t = 1.

We now want to describe a situation during the season which leads to the
multigraph described in the reduction. For this purpose, we declare which games
have already been played by describing the outcomes of the games as follows:
The outcome of a game between team i and team j is:

a win for team i if $i \in V'$ and $j \in V$,

a win for team i if $i \in V$ and $j \in V''$,

a tie if $i \in V'$ and $j \in V''$,

a tie if $i \in V'$ and $j \in V'$,

a tie if $i \in V''$ and $j \in V''$.

The only remaining games are the games between teams in the set V. Due to
modification (*), it does not make a difference if we declare some of the games
in the set V as games that were already played and that ended in a tie. Thus, we
can achieve that the remaining games are exactly the remaining games between
teams from V which are needed in the reduction.

According to the above described outcomes, we have the following situation:

Teams from V have (N−1) + 3 · (2N) = 7N−1 points.

Teams from V' have 3 · N + (N−1) + 2N = 6N−1 points.

Teams from V'' have (2N−1) + N = 3N−1 points.

For example, the 7N−1 points for a team from V result from the observation
that the N−1 games against the other teams from V are not yet played (or a
tie), which, due to modification (*), makes up for N−1 points. On the other
hand, there are 2N wins against the teams from V''. The other points can be
verified similarly.
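
These three totals can be verified mechanically. The sketch below, an illustration with a hypothetical helper name `season_points`, plays out the declared outcomes for a given N and counts points, crediting the unplayed games inside V with one point per team as modification (*) prescribes.

```python
from itertools import combinations


def season_points(N):
    """Points of all 4N teams after the outcomes declared in the text."""
    teams = [("V", i) for i in range(N)]
    teams += [("V'", i) for i in range(N)]
    teams += [("V''", i) for i in range(2 * N)]
    points = dict.fromkeys(teams, 0)
    for a, b in combinations(teams, 2):
        groups = {a[0], b[0]}
        if groups == {"V", "V'"}:       # the team from V' wins
            winner = a if a[0] == "V'" else b
            points[winner] += 3
        elif groups == {"V", "V''"}:    # the team from V wins
            winner = a if a[0] == "V" else b
            points[winner] += 3
        else:
            # ties; games inside V are unplayed but credited one point by (*)
            points[a] += 1
            points[b] += 1
    return points


pts = season_points(3)
```

For N = 3 every team from V has 20 points, every team from V' has 17, and every team from V'' has 8, in line with 7N−1, 6N−1 and 3N−1.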

The team for which we want to decide whether it is eliminated or not is in
the set V. Hence, the labeled multigraph corresponding to the above situation
of the season has labels 0 on the vertices from V and labels −N and −4N on
the vertices of V' and V'', respectively.

The reduction described earlier requires that some of the teams in V have
labels +1 and −2 instead of 0. We achieve this by using the dummy teams from
V' and V'' as follows:

If a team $v_i \in V$ is to have a label +1, then we declare the game between $v_i$
and $v'_i$ to be a tie instead of a win for $v'_i$. This increases the label of $v_i$ from 0 to
+1 and reduces the label of $v'_i$ by 2.

If a team $v_i \in V$ is to have a label −2, then we declare the game between $v_i$
and $v''_i$ to be a tie instead of a win for $v_i$. This reduces the label of $v_i$ from 0 to
−2 and increases the label of $v''_i$ by 1.

After these changes, the labels of all teams $v \in V$ are as required in the
reduction.

We have shown that during a season, the situation described in the
reduction can in fact occur as a subproblem in a larger season. We only have to
argue that the two problems are equivalent.

Note that in the above season with 4N teams, the team for which we want
to decide elimination has 7N−1 points. Hence the champions also will have at
least 7N−1 points. The teams from V' and V'' have at most max{6N−1, 3N} <
7N−1 points and the only remaining games are between teams from V. Thus,
EFEP on the set $V \cup V' \cup V''$ is identical to EFEP on the set V.

4 Restricted Versions of EFEP 

Taking a close look at the reduction in Section 3, we see that there is a degree
bound: Every team has at most three remaining games to play. Thus, our
reduction also shows that the restricted European Football Elimination Problem,
where all teams have at most three remaining games, is already NP-complete.

Naturally, the question arises what the situation is like if there are at most
two remaining games for each team. We obtain the following result:

Theorem 2. The European Football Elimination Problem under the (3, 1)-rule
can be solved in polynomial time if there are at most two remaining games for
each team.

Proof. Consider the labeled input multigraph which describes the given instance 
of EFEP. Since every vertex has degree bounded by two, the graph is the union 
of components which are paths and cycles (and isolated vertices). It is enough to 
decide (independently of each other) for each of those paths and cycles whether 
there is an outcome for the remaining games such that all teams have a label 
which is not larger than zero. 

First, consider a path with an end vertex v which has label a.

[Figure: a path whose end vertex carries the label a.]

If $a \geq 2$, then team N can no longer become champions, since the label of v
will be at least +1.

If a = 1, then it is clear that in order to make team N champions, the team
corresponding to v has to lose its only remaining game. Hence, we can reduce
the path by removing v and increasing the label of its only neighbor by two.
Similarly, the following operations can be applied:

If $a \in \{0, -1\}$, then remove v and define the incident edge (=game) to be
a tie. Team N can become champions if and only if it can become champions
for the reduced path. (A moment’s thought shows that the tie is the best result
possible for team N.)

If $a \leq -2$, then remove v and define the incident edge (=game) to be a
win for the removed team, which decreases the label of its neighbor by one.
Team N can become champions if and only if it can
become champions for the reduced path.

Thus, in a linear number of steps, we can reduce a path to a trivial graph 
for which EFEP can be decided easily. 

Now, consider a component which is a cycle. We choose one of the edges of the 
cycle and consider all three possible outcomes for the corresponding game. For 
each of the possible outcomes, we remove the edge from the cycle and adjust the 
labels correspondingly. We obtain a path which we can deal with as described 
above. Altogether, we can decide for every component in linear time whether
there is an outcome for the remaining games such that all teams in the component 
have a label at most 0. □ 
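
The case analysis in the proof can be sketched in code. This is a minimal sketch, not the authors' implementation. It assumes a label normalization that the case analysis suggests but this section does not state explicitly: a tie leaves both labels unchanged, a win raises the winner's label by 2, and a loss lowers the loser's label by 1. The function names are hypothetical, and each component is assumed to be non-empty.

```python
def path_feasible(labels):
    """Decide a path component. labels[i] is the label of the i-th vertex
    along the path; consecutive vertices share one remaining game.
    Returns True iff all labels can be brought to <= 0."""
    labels = list(labels)
    while len(labels) > 1:
        a = labels.pop(0)          # current end vertex v
        if a >= 2:                 # even a loss leaves v above 0
            return False
        if a == 1:                 # v must lose, so its neighbor wins
            labels[0] += 2
        elif a <= -2:              # v can afford a win, its neighbor loses
            labels[0] -= 1
        # a in {0, -1}: the game is a tie, the neighbor is unchanged
    return labels[0] <= 0          # trivial graph: one isolated vertex


def cycle_feasible(labels):
    """Decide a cycle component by trying all three outcomes of the game
    between the first and the last vertex, then solving the path."""
    for first_delta, last_delta in ((2, -1), (0, 0), (-1, 2)):
        path = list(labels)
        path[0] += first_delta
        path[-1] += last_delta
        if path_feasible(path):
            return True
    return False
```

Each path is processed in one pass, and each cycle in three passes, which matches the linear running time claimed in the proof.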



5 General $(\alpha,\beta)$-rules

The NP-completeness result which we have presented above can also be
generalized to the case of the $(\alpha,\beta)$-rule, where $\alpha > 2\beta$ (and $\beta \geq 1$).
This can basically be achieved by replacing a label $+1$ in the reduction by a
label $\beta$ and by replacing a label $-2$ by $(\beta - \alpha)$, while labels 0 remain
the same.

We are also able to prove the NP-completeness for the $(\alpha,\beta)$-rule when
$\beta < \alpha < 2\beta$ (and $\beta \geq 1$). Here, we have to use a different multigraph in
the reduction, but the technique still is very similar to the one that we have
presented. We omit the details of the proof.
Abstract. Complexity classes, which are defined via finite commutative
monoids, can be considered as (very) regular counting classes. These
include well-known classes like NP, coNP, $\oplus$P, other MOD-classes, but
also the classes of finite acceptance type, and many more.

In these classes, the acceptance mechanism can be defined by a regular
leaf language, where acceptance really depends only on the number of
occurrences of the various letters in the actual leafstring. In other words,
the acceptance mechanism is given by a symmetric regular language.
Generally all classes described in this way are the so-called eventually
periodic counting classes.

In this paper we relax the symmetry condition on the regular leaf lan- 
guage: We allow all regular leaf languages, but we admit only machines, 
which on all input words will only produce symmetric leafstrings, which 
means all appearing leaf strings will either under all permutations belong 
to the acceptance language, or under all permutations not belong to the 
acceptance language. 

We give an exact characterization of all complexity classes which can be
described in this manner. It turns out that besides the classes obtained
via finite commutative monoids, we can also describe promise classes like
UP or MODZ$_2$P in this way.



1 Introduction 

Counting classes have been considered for about 20 years now. The investiga- 
tions started, when Valiant [17] introduced the class #P of functions, which 
output the number of accepting paths of a given nondeterministic polynomial 
time machine on the respective input. The most famous counting classes,
including P, NP, $\oplus$P have been intensively investigated in the 80's and early 90's.
Their general form is: take a polynomial time nondeterministic machine, count
the number of accepting computations of this machine on a given input x, and
finally decide upon acceptance by checking, whether this number is in a given 
easily recognizable subset of N. 

This concept can be generalized by allowing several different types of
accepting paths (but only finitely many), or equivalently by having several (but
finitely many) different polynomial time machines working at the same time on
the same input. Then we have to check a vector of natural numbers for
membership in a given, again easily recognized subset of $\mathbb{N}^k$.

In both cases the acceptance condition forms a (usually regular) language 
which is symmetric in the following sense: The membership of a word in the 
acceptance language only depends on the Parikh vector of this word. Articles on 
this subject include [5, 1,8]. 

In this paper we also assume a regular acceptance language. But we do not
demand the whole language to be symmetric. Instead we only assume that the
actually appearing leafstrings are symmetric. In other words: the order, in which 
the computation paths of the underlying nondeterministic polynomial time ma- 
chine appear, should be completely irrelevant. In this way the defined complexity 
classes become promise classes, the promise being, that no nonsymmetric word 
will appear as leafstring on any input. We call complexity classes defined this 
way generalized regular counting classes. 

A different (but of course related) line of motivation for this work came
from the theory of serializable computations, which was introduced in [4] by Cai
and Furst: A series of deterministic polynomial time (or even logarithmic space)
computations is executed, where each computation takes as input the "original"
input x, a number i (the job number), and the result of the job with number
i - 1, which was executed immediately before job i. The number of jobs may be
exponential in the length of x. The result of each job is a number from a fixed
finite set, w.l.o.g. the set $\{1, \ldots, k\}$. The final job decides upon acceptance or
rejection.

Cai and Furst showed that for $k \geq 5$ the class of problems decidable by such
a procedure is exactly PSPACE. Trivially, for $k = 1$ one obtains only languages
in P (and all of them). Ogihara [16] showed that for $k = 2$ the class of accepted
languages is $\oplus$OptP (defined in [10]), which in the terminology of [12] can be
written as P^ ’ ®-P. Finally, also in [12], it was shown that for $k = 3$ and $k = 4$ one
obtains the classes P^= ®-(VAM0D3©)P and P^^ ®-(VAM0D3©)(3V©M0D3©)P,
respectively.

Beigel and Straubing [2] defined the concept of local self reductions and
exhibited a strong connection to Cai and Furst's bottleneck machines. Thus the
general solution from [12] also solved the question for the exact power of several
types of local self reducibility.

An interesting modification of the bottleneck machine model was defined by
Hemaspaandra and Ogihara [9], who restricted the computation to those cases
where the acceptance/rejection question is decided in the same way, no matter
in which order the jobs are executed. They defined the classes SSF$_k$ (symmetric
safe storage via k tokens) and showed that all such classes are subclasses of the
polynomial time hierarchy with MOD-quantifiers, i.e. MOD-PH.

We slightly generalize their idea in a very natural way. Just like the ordinary
bottleneck machine model was translated into the leaf language formulation
(which led to a complete characterization in [12]), we can translate the symmetric
bottleneck machine model to the leaf language machinery, too. We will do this
in Section 3, after a short introduction to the leaf language area (Section 2).

2 Leaf Languages

In this section we briefly introduce the concept of leaf languages. We are
especially interested in regular leaf languages.

The concept of leaf languages was defined by Bovet, Crescenzi, and Silvestri
[3] and independently by Vereshchagin [19]. Given a language L one can define
the class $\mathrm{BalancedLeaf}^{P}(L)$ by the following definition: An admissible machine is
a polynomial time nondeterministic Turing machine, whose computation paths
are balanced, i.e. all paths are equally long and they are numbered in such a
way that a polynomial time deterministic machine can find out the path related
to a given number and vice versa, and the machine on every path produces as
its result a letter from the alphabet $\Sigma$, over which L is defined.

Now, if an input is given, the machine determines in a unique way a word
over the alphabet $\Sigma$, if we just read all the letters produced as results on the
paths in their given order. The machine is said to accept if this resulting word
is in L, otherwise it is said to reject.
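
As an illustration of this acceptance mechanism, the following sketch builds the leaf string of a toy balanced machine whose paths are the assignments of n Boolean variables, enumerated in lexicographic order. The helper names and the example predicate are assumptions made for illustration only. The two leaf languages shown are the standard ones for NP and for parity-P over the alphabet {0, 1}.

```python
import re
from itertools import product

def leaf_string(n_vars, path_output):
    """Concatenate the letters produced on all 2**n_vars paths, in order."""
    return "".join(path_output(bits) for bits in product((0, 1), repeat=n_vars))

def accepts_np(leaf):
    """Leaf language for NP: at least one path outputs 1."""
    return re.fullmatch(r"[01]*1[01]*", leaf) is not None

def accepts_parity(leaf):
    """Leaf language for parity-P: an odd number of paths output 1."""
    return leaf.count("1") % 2 == 1

# Toy predicate: a path outputs 1 iff its assignment satisfies x0 AND NOT x1.
leaf = leaf_string(2, lambda bits: "1" if bits[0] and not bits[1] else "0")
```

Both leaf languages here depend only on the number of occurrences of each letter, so they are symmetric in the sense discussed in the Introduction.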

The leaf language class described by L is the class of all sets A for which
such a machine exists, which accepts the input if and only if it belongs to A. If
an "acceptance language" L and a "rejection language" R are given satisfying
$L \cap R = \emptyset$, we obtain a promise class. The promise is that every leafstring will
be in $L \cup R$. The according class is denoted by $\mathrm{BalancedLeaf}^{P}(L; R)$.

A number of articles based on this concept have been written in the past 6
years, including [14, 15, 11, 18, 12]. The main result from the pioneering work of
[3, 19] that attracted many researchers' attention is the fact that for leaf language
definable complexity classes a criterion can be given for the existence of an oracle
separation of these classes. This is a strong criterion, saying an oracle separation
exists if and only if the leaf language of the one class cannot be reduced to the
leaf language of the other class by a so-called polylogtime reduction. (In case
of acceptance and rejection languages the same can be formulated for reduction
between pairs of languages.)

The criterion in its original and most general form in many cases is very hard 
to apply. Thus, in [11] a general technique was developed, by which for a very 
general form of bounded counting classes (the number of accepting paths of a 
nondeterministic machine have to be counted up to a fixed value) an algorithm 
could decide the existence of a separating oracle. In [13] it was shown that
regular leaf languages given by the word problem over a finite commutative 
monoid always lead to characterizations of eventually periodic counting classes. 



3 Serializability 



Instead of first defining universal serializability in the original way of [9] and
then showing that it is a special case of some sort of leaf language approach, we
just define serializability and universal serializability in terms of leaf languages.

Definition 1. A class $\mathcal{C}$ is serializable, if there is a regular language L, such
that

$\mathcal{C} = \mathrm{BalancedLeaf}^{P}(L)$.

It is an easy exercise to prove the equivalence of this definition with the 
original one. Moreover, this connection has been used in several papers in the 
past, see e.g. [12]. 

Universal serializability differs from the general serializability in the assump- 
tion that the acceptance behavior does not depend on the order, in which the 
jobs are executed. We can also translate this to the leaf language approach - 
here it means that the leaf string’s membership in the language may not depend 
on the order, in which the symbols appear. 

Thus, we have to define the symmetric restriction of a language L. Let $w$ and
$w'$ be two words of length $n$. We say $w'$ can be obtained from $w$ by permuting
letters, if there is a permutation $\pi$ on the set $\{1, \ldots, n\}$ such that
$w = w_1 w_2 \cdots w_n$ and $w' = w_{\pi(1)} w_{\pi(2)} \cdots w_{\pi(n)}$.

Definition 2. Let L be any language over alphabet $\Sigma$. The symmetric
restriction of L, denoted by Sym(L), is defined as the set of all strings $w \in \Sigma^*$ such
that every $w' \in \Sigma^*$ which can be obtained from $w$ by permuting the letters is in
L.



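Note that $\mathrm{Sym}(L) \subseteq L$ for all $L \subseteq \Sigma^*$. In particular, also
$\mathrm{Sym}(\overline{L}) \subseteq \overline{L}$, and thus

$\mathrm{Sym}(L) \cap \mathrm{Sym}(\overline{L}) = \emptyset$.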
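
A brute-force membership test makes the symmetric restriction concrete. This is a minimal sketch that enumerates all permutations, so it is only usable for short words. The example language a*b* and all function names are assumptions chosen for illustration.

```python
import re
from itertools import permutations

def in_sym(word, in_lang):
    """True iff every permutation of `word` satisfies the predicate in_lang."""
    return all(in_lang("".join(p)) for p in set(permutations(word)))

def in_L(w):
    """Example language L = a*b* over the alphabet {a, b}."""
    return re.fullmatch(r"a*b*", w) is not None

def in_co_L(w):
    """Complement of the example language."""
    return not in_L(w)
```

For this example only unary words lie in Sym(L). A word such as "ab" lies neither in Sym(L) (its permutation "ba" is outside L) nor in the symmetric restriction of the complement (the word itself is in L), which shows why classes defined through these two sets carry a promise.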

Now we are prepared to define our version of universal serializability: 

Definition 3. A class C is universally serializable, if there is a regular language 
L, such that 

$\mathcal{C} = \mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L}))$.

This is essentially the same definition as the one given in [9], only the
acceptance mode is left completely general here. Thus, universal serializability in our
definition can be seen as a generalization of the original one.

Note that by the above definition, universally serializable classes are exactly
those generalized regular counting classes introduced in Section 1.

4 Promise Counting Classes and Parikh Sets 

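Given an alphabet $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$, the Parikh function
$\mathrm{Parikh}\colon \Sigma^* \to \mathbb{N}^k$ is defined as

$\mathrm{Parikh}(w) = (|w|_{\sigma_1}, \ldots, |w|_{\sigma_k})$,

where $|w|_{\sigma_i}$ denotes the number of occurrences of $\sigma_i$ in $w$.
For a set $S \subseteq \Sigma^*$, we define

$\mathrm{Parikh}(S) = \{\mathrm{Parikh}(w) \mid w \in S\}$.

Now, with every subset $X \subseteq \mathbb{N}^k$ we can associate a language $L_X \subseteq \Sigma^*$ as follows:

$L_X = \{w \in \Sigma^* \mid \mathrm{Parikh}(w) \in X\}$.

Clearly, $L_X$ is a symmetric language, that is $L_X = \mathrm{Sym}(L_X)$ and $\overline{L_X} = \mathrm{Sym}(\overline{L_X})$.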
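
The two definitions translate directly into code. In this sketch the alphabet is passed as an ordered string and the set X is passed as a predicate on vectors; both conventions and the function names are assumptions made for illustration.

```python
from collections import Counter

def parikh(word, alphabet):
    """Parikh vector of `word` with respect to the ordered alphabet."""
    counts = Counter(word)
    return tuple(counts[s] for s in alphabet)

def in_L_X(word, alphabet, X):
    """Membership in the language L_X, where X is a predicate on vectors."""
    return X(parikh(word, alphabet))

# Example set X: vectors whose first coordinate is even.
def even_first(v):
    return v[0] % 2 == 0
```

Since `in_L_X` looks only at the Parikh vector, every permutation of a word receives the same answer, which is the symmetry property stated above.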

Let A and B be disjoint subsets of $\mathbb{N}^k$. Then the promise counting class
$(A, B)$P is defined to be the leaf language class $\mathrm{BalancedLeaf}^{P}(L_A; L_B)$. We
call $(A, B)$P a recognizable promise counting class, if A and B are recognizable
subsets of $\mathbb{N}^k$. Here we consider $\mathbb{N}^k$ as a finitely generated commutative monoid
and call a set recognizable, if its syntactic monoid is finite.

Analogously, $(A, B)$P is called a semilinear promise counting class, if A and
B are semilinear sets. A subset of $\mathbb{N}^k$ is semilinear, if it equals a finite union of
linear sets. Linear sets in turn are sets of the form

$\{a + n_1 b_1 + \cdots + n_r b_r \mid n_i \in \mathbb{N},\ i = 1, \ldots, r\}$

for fixed vectors $a, b_1, \ldots, b_r$. It is well known (see e.g. [7]) that the semilinear
sets in $\mathbb{N}^k$ are exactly the rational subsets of $\mathbb{N}^k$. Thus, instead of semilinear
promise counting classes we could as well speak of rational promise counting
classes.
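
Membership in a linear or semilinear set can be tested by bounded search. The sketch below is a naive illustration, not an efficient algorithm. It relies on the observation that a coefficient of a nonzero period vector can never exceed the largest coordinate of the target vector. The function names and the data layout (a linear set as a pair of offset and list of periods) are assumptions.

```python
from itertools import product

def in_linear_set(v, a, periods):
    """Is v = a + n_1*b_1 + ... + n_r*b_r for some naturals n_1..n_r?"""
    bound = max(v, default=0)  # no useful coefficient can be larger
    for ns in product(range(bound + 1), repeat=len(periods)):
        candidate = tuple(
            a_c + sum(n * b[c] for n, b in zip(ns, periods))
            for c, a_c in enumerate(a)
        )
        if candidate == tuple(v):
            return True
    return False

def in_semilinear_set(v, linear_sets):
    """linear_sets is a list of (offset, periods) pairs; test their union."""
    return any(in_linear_set(v, a, periods) for a, periods in linear_sets)
```

For example, `in_linear_set((1, 3), (1, 1), [(0, 1)])` holds because (1, 3) = (1, 1) + 2 * (0, 1).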

We recall that by Kleene's theorem the recognizable subsets of $\mathbb{N}^k$ are
exactly those subsets which can be recognized by a finite automaton. Thus, clearly
the class of recognizable subsets of $\mathbb{N}^k$ is closed under union, intersection and
complement.

In [7] it was shown that also the class of semilinear subsets of $\mathbb{N}^k$ is closed
under boolean operations. And all recognizable subsets of $\mathbb{N}^k$ are semilinear.

Finally we note the also well known fact that for a regular language L over $\Sigma$
the set Parikh(L) is a semilinear subset of $\mathbb{N}^k$. (This follows from the fact that
regular sets are rational, and homomorphic images of rational sets are rational
again, see e.g. Chapter 2 in [6].)

5 Properties of Generalized Regular Counting 
Classes 

As we pointed out in the Introduction, generalized regular counting classes
become promise classes in a very natural sense: If the alphabet over which the
leaf language is defined has k elements, then there are two disjoint subsets
$A, B \subseteq \mathbb{N}^k$, such that the class in question is of the form $(A, B)$P, namely
for $A = \mathrm{Parikh}(\mathrm{Sym}(L))$, $B = \mathrm{Parikh}(\mathrm{Sym}(\overline{L}))$. We will show now that this
class is a semilinear promise counting class, where all unary words are admissible
leafstrings (i.e. either belong to the acceptance set or to the rejection set),
and where the acceptance and rejection sets can be separated by a recognizable
set.

In the sequel we will always use the fixed alphabet $\Sigma = \{\sigma_1, \ldots, \sigma_k\}$.
Moreover for $i \in \{1, \ldots, k\}$ we denote the i-th basis vector of $\mathbb{N}^k$, $(0, \ldots, 0, 1, 0, \ldots, 0)$
with the 1 in the i-th position, by $e_i$.

Theorem 1. Let $L \subseteq \Sigma^*$ be a regular set, and let

$\mathcal{C} = \mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L}))$

be the according generalized regular counting class. Then there are semilinear
sets $A, B \subseteq \mathbb{N}^k$ such that

1) $\forall i \in \{1, \ldots, k\}\ \forall n \in \mathbb{N}\colon\ n \cdot e_i \in A \cup B$

2) There is a recognizable set $U \subseteq \mathbb{N}^k$ such that $A \subseteq U$ and $B \subseteq \overline{U}$.

3) $\mathcal{C} = (A, B)$P

Proof. Let L and $\mathcal{C}$ be as stated. Define the sets $A = \mathrm{Parikh}(\mathrm{Sym}(L))$ and
$B = \mathrm{Parikh}(\mathrm{Sym}(\overline{L}))$. We first have to show that A and B are semilinear. But
easily we obtain

$\overline{\mathrm{Parikh}(\mathrm{Sym}(L))} = \mathrm{Parikh}(\overline{L})$.

(Read carefully: this is not a typo, we really have Sym(L) on the left hand side,
but $\overline{L}$ on the right hand side...) Indeed, a vector is missing from Parikh(Sym(L))
exactly if some word with this Parikh vector is not in L. Since $\overline{L}$ is regular, it has
a semilinear Parikh set, and since the complement of a semilinear set is semilinear
again, we conclude that Parikh(Sym(L)) is semilinear too. Thus we proved the
semilinearity of A; the semilinearity of B can be shown by symmetry, exchanging
the roles of L and $\overline{L}$.
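
The identity between the complement of Parikh(Sym(L)) and the Parikh set of the complement of L can be checked by brute force on short words. The following sketch does this over the two-letter alphabet {a, b}. The example language and the function names are assumptions, and a finite check is of course only a sanity test, not a proof.

```python
import re
from itertools import product

def parikh(w):
    """Parikh vector over the fixed two-letter alphabet {a, b}."""
    return (w.count("a"), w.count("b"))

def check_identity(in_L, max_len):
    """Compare both sides of the identity on all vectors of weight <= max_len."""
    words = ["".join(t) for n in range(max_len + 1)
             for t in product("ab", repeat=n)]
    all_vectors = {parikh(w) for w in words}
    # Vectors that occur as Parikh vector of some word outside L.
    parikh_co_L = {parikh(w) for w in words if not in_L(w)}
    # Vectors all of whose words are in L, i.e. Parikh(Sym(L)).
    parikh_sym_L = {v for v in all_vectors
                    if all(in_L(w) for w in words if parikh(w) == v)}
    return all_vectors - parikh_sym_L == parikh_co_L

def example_language(w):
    """Example regular language (ab)* | a*, chosen only for illustration."""
    return re.fullmatch(r"(ab)*|a*", w) is not None
```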

Now part 3) of the claim is trivially satisfied, part 1) is easy. It remains to
show part 2). To this end we use the normal form of words over $\Sigma$: A word
$w \in \Sigma^*$ is in normal form, if for $i < j$ every occurrence of $\sigma_i$ in $w$ is to the left of
every occurrence of $\sigma_j$ in $w$. (We can also say: the word $w$ is in sorted form.)
We define

$L' = \{w \in L \mid w \text{ is in normal form}\}$.

Then $L'$ is clearly a regular set. Moreover, $L' \subseteq L$ and $\mathrm{Parikh}(L') \subseteq \mathrm{Parikh}(L)$.
But on the other hand

$\mathrm{Parikh}(\mathrm{Sym}(L)) \subseteq \mathrm{Parikh}(L')$,

and

$\mathrm{Parikh}(\mathrm{Sym}(\overline{L})) \cap \mathrm{Parikh}(L') = \emptyset$.

We define $U = \mathrm{Parikh}(L')$. Then, we just showed $A \subseteq U$ and $B \subseteq \overline{U}$. We claim
that U is recognizable. This is well known, but it can also be proved elementarily
by an argument similar to the proof of the pumping lemma for regular sets. Due
to space limitations, we have to leave the details to the reader.



6 A Characterization of Generalized Regular 
Counting Classes 

In this section we will show that the reverse direction of Theorem 1 holds too. 
Thus we obtain a complete characterization of classes that can be represented
as generalized regular counting classes. 

Theorem 2. Let $\mathcal{C}$ be a complexity class, for which there are two rational subsets
$A, B \subseteq \mathbb{N}^k$ satisfying the following conditions:

1) $\forall i \in \{1, \ldots, k\}\ \forall n \in \mathbb{N}\colon\ n \cdot e_i \in A \cup B$

2) There is a recognizable set $U \subseteq \mathbb{N}^k$ such that $A \subseteq U$ and $B \subseteq \overline{U}$.

3) $\mathcal{C} = (A, B)$P

Then there is a regular set L over $\Sigma$, such that

$\mathcal{C} = \mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L}))$.

Proof. Let $\mathcal{C}$, A and B be given, such that A and B are rational, i.e. semilinear
subsets of $\mathbb{N}^k$ and let conditions 1) to 3) be satisfied. We have to construct a
regular language L with the claimed property.

First we construct a language $Y_A$, depending only on A, as follows: We know
that $\overline{A}$ is semilinear, since A is. Thus $\overline{A}$ can be written as a finite union of linear
sets $\overline{A}_i$, where $\overline{A}_i$ is of the form

$\overline{A}_i = a^{(i)} + \sum_{j=1}^{r_i} b_j^{(i)} \cdot \mathbb{N}$.

Here $a^{(i)}$ and all the $b_j^{(i)}$ are fixed vectors from $\mathbb{N}^k$. We form a set $X_{\overline{A}}$, consisting
of all words from $\Sigma^*$ of the following form:

- The word starts with its maximum letter; i.e. the first letter is some $\sigma_l$, and
all other letters are from $\{\sigma_1, \ldots, \sigma_l\}$.

- The word $w$ is a concatenation $w = z_1 z_2 \cdots z_s$ of partial words where the
Parikh vectors of the $z_\nu$ are either $a^{(i)}$ or some $b_j^{(i)}$. Here, $i$ is fixed for all $z_\nu$,
but $j$ may vary for different $\nu$. Moreover, there is a value $\nu_0$ such that the
Parikh vector of $z_{\nu_0}$ is $a^{(i)}$, and all other $z_\nu$ for $\nu \neq \nu_0$ have Parikh vectors of
the form $b_j^{(i)}$.

$X_{\overline{A}}$ is a regular set: A nondeterministic finite automaton can first guess the
value of $i$, and then it can guess whether to seek for a partial word with Parikh
vector $a^{(i)}$, or for a partial word with Parikh vector $b_j^{(i)}$ (any $j$). If it fails, the
automaton rejects. If it finds the guessed partial word, it continues this way.
On the fly the automaton also verifies that the first symbol of the input word
was the maximal one, and that exactly one partial word with Parikh vector $a^{(i)}$
occurred.

As we can find all possible Parikh vectors from $\overline{A}$ in this special form in a
word from $X_{\overline{A}}$ and conversely all words from $X_{\overline{A}}$ have Parikh vectors from $\overline{A}$,
we obtain

$\mathrm{Parikh}(X_{\overline{A}}) = \overline{A}$.



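Now, defining

$Y_A = \Sigma^* \setminus X_{\overline{A}}$,

we obtain

$\mathrm{Parikh}(\mathrm{Sym}(Y_A)) = \mathbb{N}^k \setminus \mathrm{Parikh}(\overline{Y_A}) = \mathbb{N}^k \setminus \mathrm{Parikh}(X_{\overline{A}}) = \mathbb{N}^k \setminus \overline{A} = A$.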

Since $X_{\overline{A}}$ is regular, also $Y_A$ is regular. So we have found a regular language
$Y_A$ such that $\mathrm{Parikh}(\mathrm{Sym}(Y_A)) = A$. Analogously we find a set $Y_B$ such that
$\mathrm{Parikh}(\mathrm{Sym}(Y_B)) = B$. Using $Y_A$ and $Y_B$ we define our regular language L as
desired:

$L = L_1 \cup L_2$,

where

$L_1 = \{w \mid \mathrm{Parikh}(w) \in U \wedge (w \in Y_A \vee w \text{ is in normal form})\}$,

$L_2 = \{w \mid \mathrm{Parikh}(w) \notin U \wedge w \notin Y_B \wedge w \text{ is not in normal form}\}$.

Clearly, L is regular, since the check whether the input's Parikh vector is in U
can be performed by a finite automaton, and the same holds for the membership
check in $Y_A$ or $Y_B$ respectively and for the normal form check.

We have to show $\mathrm{Parikh}(\mathrm{Sym}(L)) = A$ and $\mathrm{Parikh}(\mathrm{Sym}(\overline{L})) = B$.

So let $w \in \mathrm{Sym}(L)$. If $w$ is unary, it can only belong to L if its Parikh vector
is in U. But this means that the Parikh vector is not in B, and since all Parikh
vectors of unary words are in $A \cup B$, it has to be in A.

If $w$ is not unary, then there are permutations of $w$ which are not in normal
form and at least one permutation in normal form. This permutation can only
be in L if its Parikh vector is in U. But the Parikh vector of $w$ is the same as
the one of all its permutations. Thus we obtain $\mathrm{Parikh}(w) \in U$. Then all those
permutations of $w$ which are not in normal form have to belong to $Y_A$, otherwise
they would not be in L. If $\mathrm{Parikh}(w) \in \overline{A}$ held, then there would also be
a permutation $w'$ of $w$ which starts with its maximum letter and contains its
letters in an order as assumed for words in $X_{\overline{A}}$. But then $w' \in X_{\overline{A}}$ would follow,
meaning that $w' \notin Y_A$, contradicting our knowledge on $w'$. Thus $\mathrm{Parikh}(w) =
\mathrm{Parikh}(w')$ must be in A.

By symmetry we obtain for all words $w \in \mathrm{Sym}(\overline{L})$ that $\mathrm{Parikh}(w) \in B$.

So far we showed $\mathrm{Parikh}(\mathrm{Sym}(L)) \subseteq A$ and $\mathrm{Parikh}(\mathrm{Sym}(\overline{L})) \subseteq B$. Now let $w$
be such that $\mathrm{Parikh}(w) \in A$. We will show $w \in \mathrm{Sym}(L)$. From $\mathrm{Parikh}(\mathrm{Sym}(Y_A)) = A$
we obtain $w \in Y_A$, and thus, since $A \subseteq U$, also $w \in L$. This is valid for all
words with the same Parikh vector as $w$, so $w \in \mathrm{Sym}(L)$ follows.

Analogously we obtain $w \in \mathrm{Sym}(\overline{L})$ for all $w$ such that $\mathrm{Parikh}(w) \in B$. This
completes the proof.

Corollary 1 . The generalized regular counting classes are exactly those semilin- 
ear promise counting classes, where the coordinate axes belong to the acceptance 
or rejection sets, and the acceptance and rejection sets can be separated by a 
recognizable set. 

Or equivalently: The classes characterizable by universal serializability are 
exactly those semilinear promise counting classes of the above type. 

Moreover, we get an upper bound for all classes definable by universal
serializability, if we use the fact that under the conditions of Theorem 2 we have

$(A, B)\mathrm{P} \subseteq (U, \overline{U})\mathrm{P}$.

Corollary 2. For all classes definable by universal serializability, there is a reg- 
ular counting class that contains the given class. 



7 Case Studies 

In this section we will carry out our general construction in two very easy cases. 
This is just for presentation reasons; of course the technique works also for much 
harder characterizations. Here our intention is only to give a flavour, how the
results can be applied: 



7.1 Characterizing UP 
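Consider the following sets in $\mathbb{N}^2$:

$A = \left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \mathbb{N} \right) \cup \left( \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \mathbb{N} \right)$,

$B = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \mathbb{N} \ \cup\ \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \mathbb{N}$.

These sets are even recognizable, and in particular semilinear. Obviously, B contains
both coordinate axes, and if we choose $U = \overline{B}$, then $U$ is a recognizable set
separating A and B.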

Thus there is a regular set L, such that 

$\mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L})) = (A, B)$P.

On the other hand one can easily show by direct simulations that $(A, B)$P =
UP. Consequently, we have proven that the promise counting class UP can be 
characterized by universal serializability. 

In fact, we can explicitly describe the regular set L obtained for that
characterization:

$\overline{L} = \{a\}^* \cup \{b\}^* \cup aa\{a\}^* bb\{b\}^*$.

Clearly, symmetric words in $\overline{L}$ can only be the unary words. Non-unary words
having exactly one a or exactly one b are symmetric elements from L. All other
words have at least two a's and at least two b's, and thus their normal forms are
in $\overline{L}$, while all other permutations are in L.

Using Corollary 2 we obtain UP C NP, a fact which is of course well known. 
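
The description of the symmetric words can be checked by brute force for small Parikh vectors. The sketch below assumes the reading in which the displayed regular expression describes the complement of L (this is the reading under which the unary words end up in the rejection set B, as the text requires). The function names are hypothetical.

```python
import re
from itertools import permutations

# Assumed reading: this expression describes the COMPLEMENT of L.
CO_L = re.compile(r"a*|b*|aaa*bbb*")

def in_L(w):
    return CO_L.fullmatch(w) is None

def classify(i, j):
    """Return 'A' if all words with i a's and j b's are in L,
    'B' if all of them are in the complement, and None otherwise."""
    words = {"".join(p) for p in permutations("a" * i + "b" * j)}
    if all(in_L(w) for w in words):
        return "A"
    if all(not in_L(w) for w in words):
        return "B"
    return None
```

Under this reading, vectors with exactly one a or exactly one b (and both letters present) are classified as A, the coordinate axes as B, and vectors with at least two a's and at least two b's fall outside the promise.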

7.2 The power of a small fraction of the group $S_3$

We investigate the group $S_3$, given as $\{e, b, b^2, a, ab, ab^2\}$, where $ba = ab^2$ and $e$
is the unit element. We consider the language

$L = \{w \in \{a, b\}^* \mid w = e \text{ in } S_3\}$.



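We obtain $\mathrm{Sym}(L) = \{a^{2n} \mid n \geq 0\} \cup \{b^{3n} \mid n \geq 0\}$, and $\mathrm{Sym}(\overline{L})$ consists of three
parts: those words with an odd number of a's, those words with exactly one b,
and those words without any a, where the number of b's is not a multiple of 3.
Then we obtain the following sets A and B in $\mathbb{N}^2$:

$A = \begin{pmatrix} 2 \\ 0 \end{pmatrix} \cdot \mathbb{N} \ \cup\ \begin{pmatrix} 0 \\ 3 \end{pmatrix} \cdot \mathbb{N}$

and $B = B_1 \cup B_2 \cup B_3 \cup B_4$, where

$B_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 2 \\ 0 \end{pmatrix} \cdot \mathbb{N} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} \cdot \mathbb{N}$, $\quad B_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \cdot \mathbb{N}$,

$B_3 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 0 \\ 3 \end{pmatrix} \cdot \mathbb{N}$, $\quad B_4 = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \begin{pmatrix} 0 \\ 3 \end{pmatrix} \cdot \mathbb{N}$.

We may choose

[The display defining the set U is not recoverable from the source.]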

Then $U$ is a recognizable subset of $\mathbb{N}^2$ and we obtain the fact that the class
$\mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L}))$ is contained in $\mathrm{BalancedLeaf}^{P}(L_U)$, which can
easily be seen to be coMOD$_6$P. In fact, a thorough inspection of the sets A and
B shows that

$\mathrm{BalancedLeaf}^{P}(\mathrm{Sym}(L); \mathrm{Sym}(\overline{L})) = \mathrm{coMODZ}_2\mathrm{P}$.
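
The description of Sym(L) and of the symmetric restriction of the complement can be verified by brute force for small Parikh vectors. The sketch below represents the group by permutations of three points; the concrete choice of generators and all names are assumptions, chosen so that the relation ba = ab^2 holds under the product convention used here.

```python
from itertools import permutations

E = (0, 1, 2)
GEN = {"a": (1, 0, 2), "b": (1, 2, 0)}   # a: transposition, b: 3-cycle

def mul(p, q):
    """Group product: apply p first, then q."""
    return tuple(q[p[i]] for i in range(3))

def value(word):
    """Evaluate a word over {a, b} in the group."""
    g = E
    for letter in word:
        g = mul(g, GEN[letter])
    return g

def classify(i, j):
    """Return 'A' if every word with i a's and j b's evaluates to e,
    'B' if none does, and None if the permutations disagree."""
    words = {"".join(p) for p in permutations("a" * i + "b" * j)}
    hits = [value(w) == E for w in words]
    if all(hits):
        return "A"
    if not any(hits):
        return "B"
    return None
```

For instance, the vector (2, 3) is classified as None: the word aabbb evaluates to e, while its permutation ababb does not.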

8 Conclusion 

In this paper, we introduced the concept of generalized regular counting. We
showed that classes defined by this concept can also be described via universal
serializability. Our main result is a complete characterization of those classes
definable by universal serializability in terms of semilinear vector sets. 

We showed for two easy examples, how to apply the results. As an open 
question, we ask for an application to the monoid of all transformations on 3 
elements; the according language is an extension of the S3 language considered 
in Section 7. 

Acknowledgment I thank my colleagues in Stuttgart, Volker Diekert, Anca 
Muscholl, and Holger Petersen, for several helpful hints. 

Quantum Turing Machine

Abstract. The notion of quantum Turing machines is a basis of quantum
complexity theory. We discuss a general model of multi-tape, multi-head
quantum Turing machines with multi final states that also allow
tape heads to stay still.



1 Introduction 

A quantum Turing machine (QTM) is a theoretical model of quantum comput- 
ers, which expands the classical model of a Turing machine (TM) by allowing 
quantum interference to take place on their computation paths. Designing a 
QTM in general, however, is significantly harder than that of a classical TM 
because of its well-formedness condition as well as its halting condition, known 
as the timing problem. Recently Bernstein and Vazirani [2] initiated a study of 
quantum complexity theory founded on a restrictive model: a one-head, multi- 
track, stationary, dynamic, normal form, unidirectional QTM (for definitions, 
see Section 2) that prohibits a tape head to stay still. We call such a restrictive 
QTM conservative for convenience. 

One may find it easier to program a less restrictive QTM when one wishes to
solve a problem on a quantum computer. In this paper we wish to introduce a
QTM as general as possible. In Section 2, we introduce a multi-tape, multi-head
QTM with multi final states that also allows tape heads to stay still.
Although many variations of QTMs are known to be polynomially equivalent
[2,3], the degree of the polynomial overhead of these simulations remains an
open question. As we will show in Section 4, any multi-tape, multi-head,
well-formed QTM can be effectively simulated by a conservative QTM with only
cubic polynomial slowdown.

Our primary goal is to contribute to the foundation of programming a handy 
QTM. In Section 3, we will prove two fundamental lemmas: Well-formedness 
Lemma and Completion Lemma, which are important tools in constructing a 
QTM. The lemmas expand the results of Bernstein and Vazirani [2], who con- 
sidered mostly conservative QTMs. Using the lemmas, we will show that any 
computation of a well-formed QTM can be reversed on a well-formed QTM with 
quadratic polynomial slowdown. We will also address the timing problem in
Section 4. In Section 5, we will focus on an oracle QTM with multi query tapes and
multi oracles. For any oracle QTM M, we can build an oracle QTM, similar 
to the classical case, that simulates M with a fixed number of queries of fixed 
length on every computation path. 



2 Definition of Quantum Turing Machines 

This section briefly describes the formal definition of quantum Turing machines.
For our purpose, we wish to make the definition as general as possible. Here we
present a definition that is slightly more general than the one given in [2, 1].

A k-tape quantum Turing machine (QTM) M is a quintuple
$(Q, \{q_0\}, Q_f, \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k, \delta)$, where each $\Sigma_i$ is a finite alphabet
with a distinguished blank symbol #, $Q$ is a finite set of internal states including
an initial state $q_0$ and $Q_f = \{q_f^1, q_f^2, \ldots, q_f^m\}$, a set of final states, and $\delta$ is a
multi-valued, quantum transition function from $Q \times \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k$ to
$\mathbb{C}^{Q \times \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k \times \{R,N,L\}^k}$. (Note that $\delta(q_f^i, \sigma)$ must be defined.)
For brevity, write $\Sigma^{(k)}$ for $\Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_k$. A
QTM has two-way infinite tapes of cells indexed by $\mathbb{Z}$ and read/write tape heads
that move along the tapes. Directions R and L mean that a head steps right
and left, respectively, and direction N means that a head makes no movement.
We say that all tape heads move concurrently if they move in the same direction
at any time (in this case, e.g., we write $\delta(p, \sigma, q, \tau, d)$ with a single direction $d$
instead of a vector of directions).
We call a QTM dynamic if its heads never stay still. A QTM is unidirectional if,
for any $p_1, p_2, q \in Q$, $\sigma_1, \sigma_2, \tau_1, \tau_2 \in \Sigma^{(k)}$, and $d_1, d_2 \in \{L,N,R\}^k$,
$\delta(p_1, \sigma_1, q, \tau_1, d_1) \cdot \delta(p_2, \sigma_2, q, \tau_2, d_2) \neq 0$ implies $d_1 = d_2$.

We assume the reader’s familiarity with the following terminology: a time- 
evolution operator, a configuration and final configuration, a superposition and 
a final superposition, a well-behaved and stationary QTM, and the acceptance 
probability of a QTM. For their definitions, see [2]. 

The following notions differ from those in [2]. A QTM is in normal form if, for every
$i \in \{1, 2, \ldots, m\}$, there exists a direction $d_i \in \{L, N, R\}^k$ such that
$\delta(q_f^i, \sigma) = |q_0\rangle|\sigma\rangle|d_i\rangle$ for any $\sigma \in \Sigma^{(k)}$. A QTM M is called synchronous if, for every x,
any two computation paths of M on x reach final configurations at the same
time. The running time of M on x is defined to be the minimal number T such
that, at time T, all computation paths of M on x reach final configurations. We
write $\mathrm{Time}_M(x)$ to denote the running time of M on x if one exists; otherwise, it
is undefined. We say that M on input x halts in time T if $\mathrm{Time}_M(x)$ exists and
$\mathrm{Time}_M(x) = T$. A QTM is well-formed if its time-evolution operator preserves
the $\ell_2$-norm. A multi-tape QTM is said to be conservative if it is a well-formed,
stationary, dynamic, unidirectional QTM in normal form with concurrent head
move. We write $P_M(x)$ to denote the probability that M accepts input x.

Throughout this paper, T denotes a function from $\Sigma^*$ to $\mathbb{N}$.

3 Fundamentals of Quantum Turing Machines

In this section, we will prove two lemmas that are essential tools in programming 
a well-formed QTM: Well-Formedness Lemma and Completion Lemma. 

For convenience, the head move directions R, N, and L are identified with
$-1$, $0$, and $+1$, respectively.

Well-Formedness Lemma. One of the most significant features of a QTM is
the well-formedness condition on its quantum transition function that reflects
the unitarity of its corresponding time-evolution operator. Here we present
in Lemma 1 three local requirements for a quantum transition function whose
associated QTM is well-formed.

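Let $M = (Q, \{q_0\}, Q_f, \Sigma_1 \times \cdots \times \Sigma_k, \delta)$ be a k-tape QTM. Recall that $\Sigma^{(k)}$
stands for $\Sigma_1 \times \cdots \times \Sigma_k$. We introduce the notation $\delta[p, \sigma, \tau|\varepsilon]$. Let $D = \{0, \pm 1\}$,
$E = \{0, \pm 1, \pm 2\}$, and $H = \{0, \pm 1, \natural\}$. Let $(p, \sigma, \tau) \in Q \times (\Sigma^{(k)})^2$ and $\varepsilon \in E^k$.
Define $D_\varepsilon = \{d \in D^k \mid \forall i \in \{1, \ldots, k\}\ (|2d_i - \varepsilon_i| \leq 1)\}$ and
$E_d = \{\varepsilon \in E^k \mid d \in D_\varepsilon\}$, where $d = (d_i)_{1 \leq i \leq k}$ and $\varepsilon = (\varepsilon_i)_{1 \leq i \leq k}$.
Let $h_{d,\varepsilon} = (h_{d_i,\varepsilon_i})_{1 \leq i \leq k}$, where for single components
$h_{d,\varepsilon} = 2d - \varepsilon$ if $\varepsilon \neq 0$ and $h_{d,\varepsilon} = \natural$ otherwise. Finally, we define $\delta[p, \sigma, \tau|\varepsilon]$ as
follows:

$\delta[p, \sigma, \tau|\varepsilon] = \sum_{q \in Q} \sum_{d \in D_\varepsilon} \delta(p, \sigma, q, \tau, d)\, |E_d|^{-1/2}\, |q\rangle |h_{d,\varepsilon}\rangle$.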

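Lemma 1. (Well-Formedness Lemma) A $k$-tape QTM $M = (Q, \{q_0\}, Q_f, \Sigma_1 \times
\cdots \times \Sigma_k, \delta)$ is well-formed iff the following three conditions hold.

1. (unit length) $\|\delta(p, \sigma)\| = 1$ for all $(p, \sigma) \in Q \times \Sigma^{(k)}$.

2. (orthogonality) $\delta(p_1, \sigma_1) \cdot \delta(p_2, \sigma_2) = 0$ for any distinct pair $(p_1, \sigma_1), (p_2, \sigma_2) \in
Q \times \Sigma^{(k)}$.

3. (separability) $\delta[p_1, \sigma_1, \tau_1|\epsilon] \cdot \delta[p_2, \sigma_2, \tau_2|\epsilon'] = 0$ for any distinct pair $\epsilon, \epsilon' \in
E^k$ and for any pair $(p_1, \sigma_1, \tau_1), (p_2, \sigma_2, \tau_2) \in Q \times \Sigma^{(k)} \times \Sigma^{(k)}$.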
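
The first two conditions of Lemma 1 are purely local and easy to check mechanically. The following is a minimal sketch, assuming a hypothetical single-tape transition table stored as a dictionary of sparse amplitude vectors; the table `delta`, the helper names, and the encoding of directions as strings are illustrative assumptions, and the separability condition is not checked here.

```python
import math
from itertools import combinations

# Hypothetical toy transition function (not from the paper): maps
# (state, scanned symbol) to {(new state, written symbol, direction): amplitude}.
s = 1 / math.sqrt(2)
delta = {
    ("q0", "0"): {("q0", "0", "R"): s, ("q0", "1", "R"): s},
    ("q0", "1"): {("q0", "0", "R"): s, ("q0", "1", "R"): -s},
}

def inner(u, v):
    """Inner product of two sparse amplitude vectors (conjugate-linear in u)."""
    return sum(complex(a).conjugate() * v[key] for key, a in u.items() if key in v)

def unit_length(table, tol=1e-9):
    """Condition 1: every transition vector has norm 1."""
    return all(abs(inner(v, v) - 1) < tol for v in table.values())

def orthogonal(table, tol=1e-9):
    """Condition 2: transition vectors of distinct (state, symbol) pairs are orthogonal."""
    return all(abs(inner(table[a], table[b])) < tol
               for a, b in combinations(table, 2))
```

Calling `unit_length(delta)` and `orthogonal(delta)` on the toy table checks only conditions 1 and 2; a full well-formedness test would also need the vectors $\delta[p, \sigma, \tau|\epsilon]$ for condition 3.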

The proof of the lemma is similar to that of Theorem 5.3 in [2]. Note that,
since any two distinct tapes do not interfere, a $k$-tape QTM must satisfy the $k$
independent conditions for the case $k = 1$. We leave the details to the reader.

Completion Lemma. A quintuple $M = (Q, \{q_0\}, Q_f, \Sigma_1 \times \cdots \times \Sigma_k, \delta)$ is called
a partial QTM if $\delta$ is a partial quantum transition function that is defined on a
subset $S$ of $Q \times \Sigma_1 \times \cdots \times \Sigma_k$. If $\delta$ satisfies the three conditions of Lemma 1 on
all entries of $S$, then we call $M$ a well-formed partial QTM [2].

Completion Lemma says that any well-formed partial QTM can be expanded
to a well-formed QTM.

Lemma 2. (Completion Lemma) For every $k$-tape, well-formed partial QTM
with quantum transition function $\delta$, there exists a $k$-tape, well-formed QTM with
the same state set and alphabet whose transition function $\delta'$ agrees with $\delta$
whenever $\delta$ is defined.

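To show the lemma, we first consider how to change the basis of a given
QTM. Let $M = (Q, \{q_0\}, Q_f, \Sigma_1 \times \cdots \times \Sigma_k, \delta)$ be a given QTM. We first partition
$\mathbb{C}^{Q \times H^k}$ into mutually orthogonal spaces $\{C_\epsilon \mid \epsilon \in E^k\}$ such that (i) $\mathbb{C}^{Q \times H^k} =
\mathrm{span}\{C_\epsilon \mid \epsilon \in E^k\}$ and (ii) for any $\epsilon \in E^k$ and any $(p, \sigma, \tau) \in Q \times \Sigma^{(k)} \times \Sigma^{(k)}$,
$\delta[p, \sigma, \tau|\epsilon] \in C_\epsilon$. Note that if $|\epsilon| \neq |\epsilon'|$ then $C_\epsilon \cap C_{\epsilon'} = \{0\}$. For each $\epsilon \in E^k$, let
$B_\epsilon$ be an orthonormal basis for $C_\epsilon$. Let $B$ be the union of all such $B_\epsilon$'s.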

We assume that, at time $t$, $M$ in state $p$ scans symbol $\sigma$ and that $\delta$ maps $(p, \sigma)$
to $\sum_{q,\tau,d} \delta(p, \sigma, q, \tau, d)|q\rangle|\tau\rangle|d\rangle$. Define the change of basis from $Q \times \{L, N, R\}^k$
to Bx E'‘ by mapping | 9 )|d) into Let 

17i denote this transform. This matrix Ui is unitary because {d,q\UiUi\q' ,d!) = 
Eeg£dn£;j,(^d'.*.9'k>*d.*>(l^dl • == (d, 9 |g',d'), which implies 

$U_1^\dagger U_1 = I$. It is known in [2] that $U_1$ preserves the $L_2$-norm iff $U_1$ is unitary.

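Let $\delta'(p, \sigma)$ denote $U_1\delta(p, \sigma)$ for any $(p, \sigma) \in S$. In what follows, we show that
$\delta'$ is “unidirectional” in the sense that if $\delta'(p, \sigma, v, \tau, \epsilon) \cdot \delta'(p', \sigma', v, \tau', \epsilon') \neq 0$
then $\epsilon = \epsilon'$. Let $\epsilon$ and $\epsilon'$ be distinct and in $E^k$. Note that the separability
condition ensures that $\delta[p, \sigma, \tau|\epsilon] \cdot \delta[p', \sigma', \tau'|\epsilon'] = 0$ for any $(p', \sigma', \tau') \in Q \times \Sigma^{(k)} \times \Sigma^{(k)}$.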
Since S\p,<T,T\e] = ^ S'(p,er,v,r,e) = 0 

for any v £ B^> if e Therefore, S' is “unidirectional.” 

The transform Ui is useful to show Completion Lemma. We go back to the 
formal proof of Completion Lemma. 



Proof of Lemma 2. Let $M = (Q, \{q_0\}, Q_f, \Sigma_1 \times \cdots \times \Sigma_k, \delta)$ be a given QTM.
Let $U_1$ be defined as above. As shown above, $U_1$ is unitary. As a result, $\delta'(S)$ is
a set of orthonormal vectors since so is $\delta(S)$.

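For each $v \in B$, let $\epsilon_v$ be $\epsilon$ such that $\delta'(p, \sigma, v, \tau, \epsilon) \neq 0$ for some $(p, \sigma, \tau) \in
Q \times \Sigma^{(k)} \times \Sigma^{(k)}$ if any, and let $\epsilon_v = (1)_{1 \le i \le k}$ otherwise. Since $\delta'$ is “unidirectional”,
$\epsilon_v$ is uniquely determined. This implies that we can define the vector $\delta''(p, \sigma)$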
as S"{p,<r) = 

Now we expand $\delta''$ to $Q \times \Sigma^{(k)}$ by adding arbitrary extra orthonormal
vectors associated with elements in Q x — S. Let S be such an expansion 
of S". We define s' by s'{p, a) = k)k)- 

We then apply the inverse transform to S {Q x S^'°^) and let S be the 
result obtained. Define M — (Q,{qo},Qf, S^'°\S). Since t/i is unitary, M must 
be well-formed. □ 
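
The expansion step in this proof, adding extra orthonormal vectors until the whole space is covered, is ordinary basis completion. The following is a minimal sketch of that step alone, assuming real-valued vectors for brevity and using Gram-Schmidt over the standard basis; the function names are illustrative, and the change of basis $U_1$ is not modeled.

```python
import math

def dot(u, v):
    """Euclidean inner product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def complete_orthonormal(vectors, dim, tol=1e-9):
    """Extend an orthonormal list of real vectors to an orthonormal basis of R^dim."""
    basis = [list(v) for v in vectors]
    for i in range(dim):
        # Candidate: the i-th standard basis vector, projected away from the current basis.
        w = [1.0 if j == i else 0.0 for j in range(dim)]
        for b in basis:
            c = dot(b, w)
            w = [wj - c * bj for wj, bj in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # candidates already in the span are skipped
            basis.append([wj / norm for wj in w])
    return basis
```

The vectors passed in are kept unchanged at the front of the result, which mirrors the requirement that the completed transition function agree with the partial one wherever the latter is defined.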

Completion Lemma also enables us to use a $k$-tuple of a single alphabet, $\Sigma^k$,
instead of $\Sigma_1 \times \cdots \times \Sigma_k$. In the following sections, we will deal only with a $k$-tape
QTM with tape alphabets $\Sigma^k$.



4 Simulation of Quantum Turing Machines 

In this section we demonstrate several simulation results using the main lemmas
in Section 3. Since we are interested only in the acceptance probability of a QTM,
the “simulation” of a QTM $M$ by another QTM $M'$ in this paper refers to
the statement that $M'$ produces the same acceptance probability as $M$ does.
More formally, we say that $M'$ simulates $M$ with slowdown $f$ if, for every $x$,
$\mu_{M'}(x) = \mu_M(x)$ and $\mathrm{Time}_{M'}(x) = f(\mathrm{Time}_M(x))$.

Assume that $M$ is a $k$-tape well-formed QTM running in time $T(x)$ on input
$x$. For $m \ge 1$, let $M_{T,m}$ denote the $(k+m)$-tape QTM that, on input $x$ in tapes
1 to $k$ and $1^{T(x)}$ in tape $k+1$ and empty elsewhere, behaves like $M$ on input $x$
except that the heads in tapes $k+1$ to $k+m$ idle in the start cells. The tape
alphabets of $M_{T,m}$ for tapes 1 to $k$ are the same as $M$'s.

For convenience, let $M(|\phi\rangle)$ denote the final superposition of $M$ that starts
with superposition $|\phi\rangle$. In the case where $|\phi\rangle$ is an initial configuration with
input $x$, we write $M(|x\rangle)$ for $M(|\phi\rangle)$.

For any pair $\sigma = (\sigma_i)_{1 \le i \le k}$ and $\tau = (\tau_j)_{1 \le j \le m}$, $\sigma * \tau$ denotes the $(k+m)$-tuple
$(\sigma_1, \ldots, \sigma_k, \tau_1, \ldots, \tau_m)$. In particular, we write $s * \sigma$ for $(s) * \sigma$ and $\sigma * s$
for $\sigma * (s)$.

Simulation by Synchronous Machines. We show how to transform any well- 
formed QTM into a well-formed, synchronous QTM with a single final state with 
the help of the information on its running time. 

Lemma 3. Let $M$ be a $k$-tape, well-formed QTM that halts in time $T(x)$ on any
$k$-tuple input string $x$. Then, there exists a $(k+2)$-tape, well-formed, synchronous
QTM $M'$ with a single final state such that, on input $(x, 1^{T(x)})$, it halts in time
$2T(x) + 2$, the last two tape heads move back to the start cells, leaving $1^{T(x)}$
unchanged, and $\mu_{M'}(x, 1^{T(x)}) = \mu_M(x)$. If $M$ already has a single final state,
then $M'$ needs only $k+1$ tapes and satisfies $M'(|x, 1^{T(x)}\rangle) = M_{T,1}(|x, 1^{T(x)}\rangle)$.

Proof. Let $M = (Q, \{q_0\}, Q_f, \Sigma^k, \delta)$ be a given QTM with $Q_f = \{q_f^1, q_f^2, \ldots, q_f^m\}$.
By Completion Lemma, it suffices to build a partial QTM $M'$ that satisfies the
lemma. Assume that $x$ is given in tapes 1 to $k$ and $1^{T(x)}$ is in tape $k+1$. Tape
$k+2$ is initially empty. The QTM $M'$ simulates each step of the computation of
$M$ using tapes 1 to $k$, together with stepping right in tape $k+1$, which counts
the number of steps executed by $M$. When $M'$ arrives at any final configuration
of $M$ with final state $q_f^i$, $1 \le i \le m$, at time exactly $T(x)$, $M'$ deposits the
number $i$ (as a single tape symbol) onto tape $k+2$, freeing itself from state $q_f^i$.
Then, $M'$ moves its $(k+1)$st tape head back to the start cell in $T(x) + 2$ steps and
enters its own final state $q_f$. Thus, the running time of $M'$ is exactly $2T(x) + 2$.

It is not difficult to check the well-formedness of $M'$ using Well-Formedness
Lemma. Note that the acceptance probability of $M$ does not change during the
above simulation process. Thus, $\mu_{M'}(x, 1^{T(x)}) = \mu_M(x)$.

If $M$ already has a single final state $q_f$, we modify the above procedure in the
following fashion. Firstly, we replace every occurrence of g/ in <5 by g/. Secondly, 
we apply the above simulation procedure. Thirdly, after the simulation, we force
$M'$ to enter $q_f$ as its final state exactly when the $(k+1)$st tape head returns to
the start cell. In this case, we do not need the $(k+2)$nd tape at all. □



Simulation by Machines with Concurrent Head Move. The simulation of
a multi-tape QTM by a single-tape QTM is a central subject in this subsection.
We show that any multi-tape, well-formed QTM can be simulated by a certain
well-formed, well-behaved QTM with concurrent head move. The simulation
overhead here is a quadratic polynomial. This result makes it possible to simulate
a multi-tape QTM by a single-tape QTM with quadratic polynomial slowdown.

Proposition 1. Let $M$ be a $k$-tape, well-formed QTM that halts in time $T(x)$
on input $x$. There exists a $(k+2)$-tape, well-formed, well-behaved QTM $M'$
with concurrent head move such that $M'$, on input $x$ in tapes 1 to $k$ and empty
elsewhere, simulates $M$ in time $2T(x)^2 + (2k+9)T(x) + 4$. Moreover, if $M$ is
synchronous, dynamic, unidirectional, or normal form, so is $M'$. In particular,
when $M$ is synchronous, $M'$ can be made stationary with extra $T(x) + 1$ steps.

Proof. Given a QTM $M$, we construct a new QTM $M'$ that simulates in
$4r + 2k + 7$ steps the $r$th step of $M$ by moving its heads back and forth in all
tapes concurrently and by expanding the simulation area by 2. Thus, the new
QTM needs $\sum_{r=1}^{T(x)} (4r + 2k + 7)$ steps (with an additional pre-computation of 4
steps) to complete this simulation on input $x$.
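
The running time claimed in Proposition 1 follows from summing the per-round costs. The sketch below is only an arithmetic check, with function names that are illustrative assumptions; it compares the round-by-round total against the closed form $2T^2 + (2k+9)T + 4$.

```python
def prop1_time_by_rounds(T, k):
    """Four pre-computation steps plus 4r + 2k + 7 steps for each round r = 1..T."""
    return 4 + sum(4 * r + 2 * k + 7 for r in range(1, T + 1))

def prop1_time_closed_form(T, k):
    """Closed form stated in Proposition 1."""
    return 2 * T * T + (2 * k + 9) * T + 4
```

Both functions agree for every nonnegative `T` and every `k`, since $\sum_{r=1}^{T} 4r = 2T^2 + 2T$.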

Let $M = (Q, \{q_0\}, Q_f, \Sigma^k, \delta)$ be a given QTM. The desired $M'$, starting from
state $q_0$, works as follows. Initially, in four steps we mark \$ in the start cell in
tape $k+2$ and we set up the simulation area of three cells (which are indexed
$-1$, 0, 1) in tape $k+1$, each of which holds the record of the head position of
$M$. We will maintain this record in tape $k+1$ by updating a symbol $(\sigma_i)_{1 \le i \le k}$
in each cell, where $\sigma_i = 1$ means that the $i$th tape head rests in the current cell.
Finally, $M'$ enters state $(q_0, \tau_0, d_0)$, where $\tau_0 = d_0 = (\$)_{1 \le i \le k}$.

At round $r$, $1 \le r \le T(x)$, we simulate the $r$th step of the computation of $M$
in $4r + 2k + 7$ steps. We start with state $(p, \tau_0, d_0)$, provided that $p$ is a current
state of $M$. Moving the head rightward along all tapes toward the end of the
simulation area, we collect the information on a $k$-tuple $\tau = (\tau_i)_{1 \le i \le k}$ of tape
symbols being scanned by $M$ at time $r$ and we then remember it by changing
our internal state from $(p, \tau_0, d_0)$ to $(p, \tau, d_0)$. After the head arrives at the first
blank cell, by applying the transition $\delta(p, \tau)$, we change $(p, \tau, d_0)$ into $(q, \tau, d)$
if $\delta(p, \tau, q, \sigma, d)$ is non-zero. To end this simulation phase, we update the head
position marked in tape $k+1$ (by using $d$) and tape symbols (by using $\tau$) by
moving the head leftward to the first blank cell in tape $k+1$. Whenever the
head reaches an end of the simulation area, we expand this area by 1 by writing
the symbol $(0)_{1 \le i \le k}$ in its boundary blank cell. After the simulation phase, $M'$
enters state $(q, \tau_0, d_0)$.

Suppose that $M$ is in normal form. It is easy to verify that no well-formed
QTM in normal form has more than one final state. Let $q_f$ be a single final state
of $M$. Adding the rule $\delta'((q_f, \tau_0, d_0), \sigma) = |q_0\rangle|\sigma\rangle|R\rangle$ makes $M'$ be in normal
form. If $M'$ is synchronous, then $M'$ can use the marker \$ in tape $k+2$ to move
its head back to the start cell and erase \$ from the tape in $T(x) + 1$ steps. This
last movement forces $M'$ to be stationary. □

Any QTM with concurrent head move can reduce the number of tapes by
merging a $k$-tuple of tape symbols, which the head is scanning, into a single tape
symbol.
Lemma 4. Let $1 \le m \le k$. Let $M$ be a $k$-tape, well-formed QTM with concurrent
head move that, on input $x$ in tapes 1 to $m$ and empty elsewhere, halts in
time $T(x)$. There exists an $m$-tape, well-formed QTM $M'$ that, on input $x$,
halts in time $T(x)$ and simulates the computation of $M$ on $x$. If $M$ is dynamic,
synchronous, stationary, unidirectional, or normal form, so is $M'$.

Simulation by Dynamic Machines. This subsection is devoted to showing that
any well-formed QTM can be simulated by a certain conservative QTM with
quadratic polynomial slowdown.

Proposition 2. Let $M$ be a $k$-tape, well-formed, synchronous QTM that halts
in time $T(x)$ on input $x$. There exists a $2k$-tape, well-formed, stationary,
synchronous, unidirectional, dynamic QTM $M'$ such that, on input $x$ in tapes 1
to $k$ and empty elsewhere, $M'$ simulates $M$ in time $2T(x)^2 + 16T(x) + 4$. If $M$
has a single final state, then $M'$ is further in normal form.

Proof. The proof uses an idea of Yao [3]. Let $M = (Q, \{q_0\}, Q_f, \Sigma^k, \delta)$ be a given
QTM with $Q_f = \{q_f^1, \ldots, q_f^m\}$. We define the desired partial $M'$ so that it
simulates the $r$th step of $M$ by a round of $4r + 13$ steps with all the heads moving
concurrently. Since $M$ requires $T(x)$ steps, $M'$ needs $\sum_{r=1}^{T(x)} (4r + 13)$ steps
together with a pre- and post-computation of $T(x) + 4$ steps, which gives the
desired running time.
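
For reference, the running time of Proposition 2 can be recovered from the per-round cost of $4r + 13$ steps and the $T(x) + 4$ steps of pre- and post-computation stated in the proof. Writing $T$ for $T(x)$, the arithmetic is:

```latex
\sum_{r=1}^{T} (4r + 13) + (T + 4)
  = 4 \cdot \frac{T(T+1)}{2} + 13T + T + 4
  = 2T^2 + 2T + 14T + 4
  = 2T^2 + 16T + 4 .
```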

We first show the proposition for the special case $k = 1$. Let $x = x_1 x_2 \cdots x_n$
be an input given in tape 1. In the initial phase, we create in four steps the
configuration $(p_0, x_1' x_2 \cdots x_n, -1, \$11, -1)$, where $x_1' = (q_0, x_1)$, $p_0$ is a
distinguished state of $M'$, and symbol \$ is in the cell indexed $-1$.

To understand the simulation phase, we associate a configuration $cf$ of $M$
with a certain configuration $cf'$ of $M'$ defined in the following way. Assume that
$cf = (q, cont, k)$, where $M$ in state $q$ scans symbol $\sigma$ in the cell indexed $k$ and $cont$
is the content of the tape. At the beginning of round $r$, $1 \le r \le T(x)$, we create
the configuration $cf'$ of $M'$ which is of the form $(p_0, cont_k, -r, 1^{r-1}\$1^{r+1}, -r)$,
where $1^{r-1}\$1^{r+1}$ is written in tape 2 with \$ in the cell indexed $-1$ (which marks
the simulation area) and $cont_k$ is identical to $cont$ except that the cell indexed
$k$ has symbol $(q, \sigma)$ instead of $\sigma$.

To disregard any head direction that results from an application of $\delta$, we
treat as a single symbol the three consecutive symbols, where the head of $M$
scans the middle symbol. In the course of the simulation, we first search in tape
2 the three consecutive symbols $\sigma_0$; $(q, \sigma_1)$; $\sigma_2$, where $M$ in state $q$ scans $\sigma_1$, and
encode them into the single symbol $(\sigma_0, (q, \sigma_1), \sigma_2)$ by moving the head back
and forth. We then apply $\delta$ to this symbol with stepping right. This makes $M'$
dynamic and also unidirectional. Finally, we decode the result and update the
content of tape 2.

For each configuration at time $r$ of $M$ on input $x$, at the end of the simulation,
$M'$ produces its associated configuration. Therefore, when $M$ enters a final
configuration at time $T(x)$, $M'$ reaches a configuration in which a tape symbol
of the form $(q_f, \sigma)$ is found in tape 2. When $M'$ finds such a symbol, it enters
its own final state in exactly $T(x)$ steps. Let $\delta'_1$ be the transition function for
$M'$.

For a general case $k > 1$, let $p = (p_j)_{1 \le j \le k}$, $\sigma = (\sigma_j)_{1 \le j \le k}$, and $\tau =
(\tau_j)_{1 \le j \le k}$. We first produce \$; 1; 1 in tapes $k+1$ to $2k$ and enter state $p_0 =
(p_0)_{1 \le j \le k}$ with changing symbol $\sigma_i$ in the start cell in tape $i$ into $(q_0, \sigma_i)$. We
then define $\delta'(p, \sigma * \tau)$ to be the product $(\delta'_1(p_1, (\sigma_1, \tau_1))) \otimes (\delta'_1(p_2, (\sigma_2, \tau_2))) \otimes \cdots \otimes
(\delta'_1(p_k, (\sigma_k, \tau_k)))$. Clearly, this QTM is well-formed, stationary, and unidirectional.
Note that the running time of the QTM $M'$ does not depend on the
number of tapes. Since $M$ halts at time $T(x)$, $M'$ finally enters state $(q_f^{i_j})_{1 \le j \le k}$
for some $k$-tuple $(i_j)_{1 \le j \le k}$ at time $2T(x)^2 + 16T(x) + 4$.

In the case where $M$ has a single final state $q_f$, we can add the new transition
rule: $\delta'(\hat{q}_f, \sigma * \tau) = |q_0\rangle|\sigma\rangle|\tau\rangle|R\rangle$, where $\hat{q}_f = (q_f)_{1 \le j \le k}$, which makes $M'$ be
in normal form. □

Since the proposition concerns a unidirectional QTM, it also gives an
extension of Unidirection Lemma in [2] to multi-tape QTMs.

Simply combining Propositions 1 and 2 and Lemmas 3 and 4, we obtain the
following corollary.

Corollary 1. Let $M$ be a $k$-tape, well-formed QTM that, on input $x$ in tape 1
and empty elsewhere, runs in time $T(x)$. There exist a quartic polynomial $p$ and a
two-tape conservative QTM $M'$ such that, on input $(x, 1^{T(x)})$, $M'$ halts in time
$p(T(x))$ and satisfies $\mu_{M'}(x, 1^{T(x)}) = \mu_M(x)$.

Note that, by modifying the simulation given in the proof of Proposition 1 
(with $O(S(x)T(x))$ slowdown, where $S(x)$ is any space bound of $M$), we can
achieve a much tighter 0{T{x)^) time bound. The detail is left to the reader. 

Reversing a Computation. First recall Definition 4.11 in [2], which defines
the notion: $M_2$ reverses the computation of $M_1$. Different from [2], we only
assume that $M_1$ and $M_2$ are well-formed QTMs (whose tape alphabets may differ)
and that $M_1$ has a single final state. We show below that we can reverse the
computation of any well-formed QTM with quadratic polynomial slowdown.

Theorem 1. Let $M$ be a $k$-tape, well-formed QTM with a single final state
that halts in time $T(x)$ on input $x$. There exist a quadratic polynomial $p$ and
a $2(k+1)$-tape, well-formed, synchronous, dynamic QTM in normal form
that, on input $x$ in tapes 1 to $k$ and $1^{T(x)}$ in tape $k+1$ and empty elsewhere,
reverses the computation of $M_{T,k+2}$ in time $p(T(x))$.

Proof. Let $M = (Q, \{q_0\}, \{q_f\}, \Sigma^k, \delta)$ be a well-formed QTM. By Lemma 3,
we have a $(k+1)$-tape, well-formed, synchronous QTM $M_1$ running in time
$2T(x) + 2$ on input $(x, 1^{T(x)})$ that satisfies $M_1(|x, 1^{T(x)}\rangle) = M_{T,1}(|x, 1^{T(x)}\rangle)$.

By modifying the proof of Proposition 2, we can show the existence of a
$2(k+1)$-tape, well-formed, stationary, synchronous, unidirectional, dynamic QTM $M_2$
in normal form such that (i) $M_2$ on input $(x, 1^{T(x)})$ halts in time $O(T(x)^2)$, (ii)
when $M_2$ halts, tape $k+1$ consists only of its input and tapes $k+2$ to




438 Tomoyuki Yamakami



$2k+2$ are empty, and (iii) $M_2(|x, 1^{T(x)}\rangle)$ is identical to $M_1(|x, 1^{T(x)}\rangle)$ when
tapes $k+2$ to $2k+2$ are ignored.

It is easy to extend Reversal Lemma in [2] to any multi-tape QTM. Let $M^R$
be the QTM (as constructed in [2]) that reverses the computation of $M_2$ with
two extra steps. Since $M_2$ is well-formed, synchronous, and dynamic, so is
$M^R$ because of its construction. Since any final superposition of $M_2$ is identical
to that of $M_{T,k+2}$, the theorem follows. □

Theorem 1 leads to the following lemma. The proof of the lemma also uses 
an argument similar to that of Theorem 4.14 in [1]. 

Lemma 5. (Squaring Lemma) Let $k \ge 2$. Let $M$ be a $k$-tape, well-formed
QTM with a single final state which, on input $x$, outputs $b(x) \in \{0, 1\}$ in the
start cell of tape $k$ in time $T(x)$ with probability $\rho(x)$. There exist a quadratic
polynomial $p$ and a $(2k+3)$-tape, well-formed, stationary, normal form QTM
$M'$ such that, on input $(1^{T(x)}, x)$, $M'$ reaches in time $p(T(x))$ the configuration
in which $M'$ is in a single final state with $1^{T(x)}$ in tape 1, $x$ in tapes 2 to $k$,
$b(x)$ in tape $k+1$, and empty elsewhere, with probability $\rho(x)^2$.

Proof. Let $M$ be a given QTM. By Theorem 1, there exists a $2(k+1)$-tape,
well-formed, synchronous, dynamic, normal form QTM $M^R$ that, on input $(1^{T(x)}, x)$,
reverses the computation of $M_{T,k+2}$ in time $O(T(x)^2)$.

We define the desired QTM $M'$ as follows. Let $(1^{T(x)}, x)$ be any input.
Starting with its initial configuration $cf_0$, $M'$ runs $M_{T,k+2}$ with ignoring tape
$2k+3$. Consider the final superposition $M_{T,k+2}(|1^{T(x)}, x\rangle)$. When $M_{T,k+2}$ halts,
$M'$ copies the content of the start cell in the output tape into tape $2k+3$ in
two extra steps. Now we have the superposition $|\phi\rangle = \sum_y \alpha_y |y\rangle|b_y\rangle$, where
$b_y \in \{0, 1\}$ is the content of tape $2k+3$ and $y$ ranges over all configurations
excluding the status of tape $2k+3$. Next, $M'$ runs $M^R$ starting with $|\phi\rangle$ with
ignoring tape $2k+3$. Note that $M^R(|\phi'\rangle) = |cf_0\rangle|b(x)\rangle$ for the superposition
$|\phi'\rangle = \sum_y \alpha_y |y\rangle|b(x)\rangle$.

By a simple calculation, we have $\langle\phi'|\phi\rangle = \sum_{y: b_y = b(x)} |\alpha_y|^2$, which equals
$\rho(x)$ since $M_{T,2k+3}$ outputs $b(x)$ with probability $\rho(x)$.

Since $M^R$ preserves the inner product, $\langle M^R(|\phi'\rangle)|M^R(|\phi\rangle)\rangle = \langle\phi'|\phi\rangle$, which
is the amplitude in $M'(|1^{T(x)}, x\rangle)$ of $|cf_0\rangle|b(x)\rangle$. Thus, the squared magnitude
of the amplitude of $|cf_0\rangle|b(x)\rangle$ is exactly $\rho(x)^2$. □
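
The inner-product step of this proof can be illustrated numerically. The sketch below assumes a hypothetical four-configuration superposition with real amplitudes; the lists `alphas` and `bits` and the value `b` are made up for illustration. It checks that the overlap between the copied superposition and the one with the copy register fixed to $b(x)$ equals the probability of output $b(x)$, so that its square is the final success probability.

```python
# Hypothetical final superposition: amplitude alpha_y and output bit b_y per configuration y.
alphas = [0.5, 0.5, 0.5, -0.5]
bits = [1, 1, 0, 1]
b = 1  # the designated output b(x)

# |phi>  = sum_y alpha_y |y>|b_y>   (after copying the output bit)
phi = {(y, bits[y]): a for y, a in enumerate(alphas)}
# |phi'> = sum_y alpha_y |y>|b(x)>  (copy register fixed to b(x))
phi_prime = {(y, b): a for y, a in enumerate(alphas)}

def inner(u, v):
    """Inner product of two sparse real amplitude vectors."""
    return sum(a * v.get(key, 0.0) for key, a in u.items())

# Probability that the original machine outputs b(x).
rho = sum(a * a for a, by in zip(alphas, bits) if by == b)
# Overlap <phi'|phi>; a unitary reversal leaves this value unchanged.
overlap = inner(phi_prime, phi)
success_probability = overlap ** 2
```

Because any unitary step preserves `overlap`, the amplitude of the target configuration after reversal is `rho`, and the observed probability is `rho ** 2`.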

Timing Problem. Let $M = (Q, \{q_0\}, \{q_f\}, \Sigma^k, \delta)$ be a $k$-tape, partial,
well-formed, normal form QTM. We assume that any computation path of $M$ on
input $x$ is completely determined by $\delta$ and ends with final state $q_f$ and that the
length of any computation path of $M$ on $x$ does not exceed $T(x)$. We modify
$\delta$ by forcing $\delta(q_f, \sigma)$ to be $|q_f\rangle|\sigma\rangle|N\rangle$ for any $\sigma \in \Sigma^k$ and let $\delta_*$ denote this
modified $\delta$. This makes $M$ halt within time $T(x)$. For clarity, let $M_*$ be
the QTM defined by $\delta_*$. Although $M_*$ may not be well-formed, when the final
superposition has unit $L_2$-norm, we can still consider the acceptance probability
of $M_*$ as before. Can we simulate $M_*$ on a well-formed QTM?
For convenience, we say that $M$ is well-structured if (1) it is well-formed, (2)
any computation path of $M$ on input $x$ is completely determined by $\delta$ and ends
with a single final state, and (3) any final superposition of $M_*$ on each input
has unit $L_2$-norm. For simplicity, we write $\mu_M(x)$ to denote the acceptance
probability of $M_*$ on input $x$.

Lemma 6. Let $M$ be a $k$-tape, well-structured, partial QTM in normal form
such that the length of any computation path of $M$ on each input $x$ is less than
$T(x)$. There exists a $(k+3)$-tape, well-formed QTM $M'$ such that, on input $x$
in tapes 1 to $k$ and $1^{T(x)}$ in tape $k+1$ and empty elsewhere, it halts in time
$O(T(x)^2)$ and satisfies $\mu_{M'}(x, 1^{T(x)}) = 2^{-\lfloor \log T(x) \rfloor - 1} \mu_M(x)$.

Proof. Let $M$ be a given QTM. We first construct a well-formed QTM $M_1$
running in time $O(T(x)^2)$ on input $(x, 1^{T(x)})$ such that the probability that $M_1$
halts in an accepting configuration in which tape $k+2$ consists only of symbols
$0^{\lfloor \log T(x) \rfloor + 2}$ is $2^{-\lfloor \log T(x) \rfloor - 1} \mu_M(x)$.

1. We produce in tape $k+3$ the “reversed” binary representation of $T(x)$
in exactly $2T(x)^2 + 12T(x) + 9$ steps. Using this representation, we produce
$\lfloor \log T(x) \rfloor + 1$ bit zeros (following control bit 1) in tape $k+2$.

2. We simulate $M$'s move by incrementing two counters. The first counter is
in tape $k+1$, of unary form, and the second one is a binary counter in tape $k+2$.
At each round of simulating a single step of $M$, $M_1$ also increments the unary
counter by stepping right and increments the binary one (using control bit 1)
in exactly $2\lfloor \log T(x) \rfloor + 8$ steps. When $M$ terminates, $M_1$ keeps incrementing
the unary counter but idles on the binary counter for each $2\lfloor \log T(x) \rfloor + 8$ steps
(using control bit 0 in tape $k+2$ for reversibility).

3. After $T(x)$ rounds, we apply a Hadamard transform, with stepping right,
to the content of the binary counter except its control bit (i.e., $\delta'(p, \sigma * a') =
\frac{1}{\sqrt{2}} \sum_{\tau \in \{0,1\}} (-1)^{a' \cdot \tau} |p\rangle|\sigma, \tau\rangle|N, R\rangle$, where $a'$ is in tape $k+2$). Since the length
of this counter is $\lfloor \log T(x) \rfloor + 1$, we can observe symbols $0^{\lfloor \log T(x) \rfloor + 1}$ in tape
$k+2$ with amplitude $2^{-(\lfloor \log T(x) \rfloor + 1)/2}$. Hence, the probability that $M_1$ reaches an
accepting configuration with $0^{\lfloor \log T(x) \rfloor + 2}$ in tape $k+2$ is $2^{-\lfloor \log T(x) \rfloor - 1} \mu_M(x)$.
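
The key property used in step 3 is that a bitwise Hadamard transform sends every $n$-bit string to the all-zero string with the same amplitude magnitude $2^{-n/2}$, whatever the counter holds. The following is a minimal sketch of that fact; the function names are illustrative assumptions, and `counter_length` uses Python's `int.bit_length`, which equals $\lfloor \log_2 T \rfloor + 1$ for $T \ge 1$.

```python
from itertools import product

def hadamard_amplitude(a, t):
    """Amplitude of |t> after applying a Hadamard transform to each bit of |a>."""
    n = len(a)
    sign = (-1) ** sum(ai * ti for ai, ti in zip(a, t))
    return sign * 2 ** (-n / 2)

def zero_probability(a):
    """Probability of observing the all-zero string after the transform."""
    return hadamard_amplitude(a, (0,) * len(a)) ** 2

def counter_length(T):
    """Number of counter bits, floor(log2 T) + 1, for T >= 1."""
    return T.bit_length()
```

For example, with $T = 5$ the counter has 3 bits, so every counter value yields the all-zero string with probability $2^{-3}$, independent of when the simulated machine actually halted.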

We design $M'$ so that the heads in tapes $k+1$ to $k+3$ return to the start cells
(using $1^{T(x)}$ in tape $k+1$) and the rest of the heads stay in the same cells as $M$'s. It is
easy to see that $M_1$ is in normal form if we add the rule: $\delta'(q_f, \sigma) = |q_0\rangle|\sigma\rangle|N\rangle$.
Moreover, if $M$ is stationary, $M_1$ is also stationary.

For the desired machine $M'$, we design it to accept input $(x, 1^{T(x)})$ exactly
when $M_1$ reaches an accepting configuration with $0^{\lfloor \log T(x) \rfloor + 2}$ written in tape
$k+2$. It thus follows that $\mu_{M'}(x, 1^{T(x)}) = 2^{-\lfloor \log T(x) \rfloor - 1} \mu_M(x)$. □

Lemma 6 solves the timing problem for any quantum complexity class whose
acceptance criterion is invariant to a polynomial fraction of acceptance probability.

5 Oracle Quantum Turing Machines 

Unlike the previous sections, we will focus on an oracle QTM, which is a natural
extension of a classical oracle TM with the help of a set of oracles.
Formally, we define a $(k+m)$-tape oracle QTM $M$ with $m$ query tapes to be
a septuple $(Q, \{q_0\}, Q_f, Q_p, Q_a, \Sigma_1 \times \Sigma_2 \times \cdots \times \Sigma_{k+m}, \delta)$, where $Q$ includes $Q_p =
\{q_p^1, q_p^2, \ldots, q_p^m\}$, a set of pre-query states, and $Q_a = \{q_a^1, q_a^2, \ldots, q_a^m\}$, a set of
post-query states, and the transition function $\delta$ is defined only on $(Q - Q_p) \times
\Sigma_1 \times \cdots \times \Sigma_{k+m}$. We assume the reader's familiarity with an oracle query. For its definition, see
[2]. Conventionally, we assume that every alphabet $\Sigma_{k+i}$, $1 \le i \le m$, includes
binary bits $\{0, 1\}$. Let $A = (A_i)_{1 \le i \le m}$ be a series of oracles such that each $A_i$
is a subset of $(\Sigma_{k+i})^*$. Note that query states $q_p^i$ and $q_a^i$ correspond only to the
$i$th query tape and the $i$th oracle $A_i$.

It is important to note that Well-Formedness Lemma and Completion Lemma 
hold even for oracle QTMs. 

Reducing the Number of Query Tapes. We can reduce the number of
query tapes by combining a given set of oracles into a single oracle together with
copying a query word written in one of the query tapes into a single query tape.
When we copy a query word $y \circ b$ from the $i$th query tape, we pad the suffix
$0^i 1^{m-i}$ (between $y$ and $b$) to make the copying process reversible.

Lemma 7. Let $m \ge 2$. Let $M$ be a $(k+m)$-tape, well-formed, oracle QTM with
$m$ query tapes that halts in time $T(x)$ on input $x \in \Sigma^*$. Let $A = (A_i)_{1 \le i \le m}$
be a series of oracles. There exists a $(k+2m+1)$-tape, well-formed, oracle
QTM $M'$ with a single query tape that, on input $(x, 1^{T(x)})$, halts in time
$5T(x)^2 + 8T(x)$ and satisfies $\mu^B_{M'}(x, 1^{T(x)}) = \mu^A_M(x)$, where $B = \{y0^i1^{m-i} \mid y \in A_i\}$.

Adjusting the Number of Queries. Let $M$ be a given QTM. At the end of
each round, in which a new QTM $M'$ simulates a single step of the computation
of $M$, we force $M'$ to make a query (of the form $0 \circ 0$) in 6 steps if $M$ does not
query. When $M$ invokes an oracle query, we force $M'$ to idle for 6 steps instead
of making a query of $0 \circ 0$. This proves the lemma below.

Lemma 8. Let $M$ be a $(k+1)$-tape, well-formed, oracle QTM in normal form
with a single query tape that halts in time $T(x)$ on input $x \in \Sigma^k$. Let $A$ be
an oracle. There exists a $(k+2)$-tape, well-formed, oracle QTM $M'$ with two
query tapes running in time $7T(x)$ on input $x$ such that $M'$ makes exactly $T(x)$
queries along each computation path and fj,]^;^\x) — pj^{x). 

Adjusting the Length of Query Words. We show that the length of query 
words can be stretched with quadratic slowdown. To extend the length of a query 
word to the fixed length T — 1, we pad the suffix in 4T -f 6 steps. 

Lemma 9. Let $M$ be a $(k+1)$-tape, well-formed, oracle QTM in normal form
with a single query tape that halts in time $T(x)$ on input $x \in \Sigma^k$. Let $A$ be
an oracle set. There exists a $(k+3)$-tape, well-formed, oracle QTM $M'$ such
that, for every input $(x, 1^{T(x)})$, it halts in time $4T(x)^2 + 10T(x)$, the length of
any query word is exactly $T(x) - 1$ on any computation path, and it satisfies
^pj^{x), where B = \ y ^ A,m>\y\ -j-2}. 
Abstract. Without comparison, the most pressing problem for the
industry of computing is the Year 2000 problem.

In this talk we explain what the Year 2000 problem is and show its close 
connection to type theory. We present a new type discipline which allows 
users to find and correct Year 2000 problems in COBOL programs. The 
type discipline is implemented in a tool called AnnoDomini, which is sold 
as a commercial product for remediation of IBM OS/VS COBOL pro- 
grams (www.hafnium.com). Although developed specifically for business 
applications, AnnoDomini borrows heavily from research in program- 
ming languages. AnnoDomini is written in Standard ML, it provides 
users with abstract (year) types, it is implemented using unification- 
based type inference, it was specified using operational semantics, and 
the core of its design was guided by formulating and proving theorems. 
The talk presents the basic ideas of AnnoDomini and ends with a demo. 
Abstract. We propose a uniform and systematic way of constructing
solutions (in any admissible shape) for systems of subtype inequalities.
It is done via systems which we call interval systems.



1 Introduction 

This paper deals with methodology of type reconstruction for lambda calculus 
with subtyping. A direct approach via constraints (see [Mit88]) leads in a natu- 
ral way to systems of subtype inequalities. It was shown in [HM95] that solving 
such systems is PTIME equivalent to the original problem of type reconstruc- 
tion. However, in general, this problem was shown to be PSPACE-hard [Tiu92]. 
This is the case when the poset of atomic subtypes forms a crown. It was later 
proved ([Pre97]) that for every poset of atomic types, the problem of deciding 
satisfiability of a system of subtype inequalities is in PSPACE, hence over crowns 
the problem becomes PSPACE-complete. 

In [Tiu92] it was proved that when the poset of atomic subtypes is a lattice
(or even a disjoint union of lattices), then the problem of satisfiability is in
PTIME. Therefore, since it contains the unification problem, it has to be
PTIME-complete (see [DKM84]). The proof of [Tiu92] doesn’t give a clue of
how to construct solutions for satisfiable systems. A compact system of notation
for presenting some solutions of subtype systems was proposed in [Tiu97] under
the name alternating directed acyclic graphs (adags).

It was shown in [Tiu97] that it is possible to design a family of systems of 
subtype inequalities, each of which has a unique solution, and the size of the 
ordinary dag representing the solution is exponential in the size of the system. 
This already happens over the two-element lattice of atomic types. However, as 
it was shown in [Tiu97], a satisfiable system of subtype inequalities has a solution 
whose adag size is polynomial in the size of the system. 

What was missing in [Tiu97] was a uniform PTIME computable assignment
to each satisfiable system $\Sigma$ of a canonical solution $\delta_\Sigma$, and an operation $F$ on
solutions such that, roughly speaking, the canonical solution which corresponds

* Partly supported by NSF grant CCR-9417382 and by Polish KBN Grant 
8 TllC 035 14. 
to $\Sigma \cup \Gamma$ is $F(\delta_\Sigma, \delta_\Gamma)$. For example, what is so successfully exploited in ML type
reconstruction is the concept of the most general unifier in the context of the
unification problem, viewed both as a canonical solution and as an operation
on solutions. This issue is clearly central to the problem of building a methodology
for type reconstruction of functional programs with subtyping, since one of
the main operations for building new constraints which describe typings of the
program is the operation of set union.

The present paper attempts to solve this problem. It turns out that in fact we
have to keep track, in a certain sense, of two canonical solutions, or rather of easily
accessible information about them. For that purpose we introduce the notion
of an interval system of inequalities (Section 3). In the same section we also 
introduce our candidate for the above mentioned operation: it is, what we call 
a saturation closure of set union of interval systems. The way of easily deriving 
solutions, we call them generic solutions, for an interval system is presented in 
Section 4. In fact what we obtain is a uniform way of building selected solutions 
for a satisfiable system in all admissible shapes. 

The other fundamental operation for building constraints which describe ty- 
pability is the operation of substitution. It will be a subject of future research 
to see how substitutions and interval systems interfere with each other. 

2 Preliminaries 

Let $X$ be any set of variables (we draw variables from a fixed countable set of
variables) and let $(C, \le)$ be a finite poset of atomic types. We assume that it
forms a lattice, but without substantial changes our results carry over to the
case when $(C, \le)$ is a disjoint union of lattices. Let $T_C[X]$ denote the least
set of expressions satisfying:

- $C \cup X \subseteq T_C[X]$;

- if $\sigma, \tau \in T_C[X]$, then $(\sigma \to \tau) \in T_C[X]$.

The elements of $T_C[X]$ are called type schemes, or just terms. $T_C$ is a shorthand
for $T_C[\emptyset]$. It consists of type schemes which do not contain variables. The
elements of $T_C$ are called types. It is sometimes convenient to view type schemes as
trees. A node $v$ in a type scheme is said to be positive if the number of 0’s on
the path leading to that node is even. Otherwise the node is said to be negative.

The partial order on $(C, \le)$ is extended to $T_C$. This is the least partial order
which contains the relation $\le$ on $C$ and satisfies:

$(\sigma_1 \to \sigma_2) \le (\tau_1 \to \tau_2)$ iff $\tau_1 \le \sigma_1$ and $\sigma_2 \le \tau_2$.

An instance of the problem of satisfiability of subtype inequalities over $(C, \le)$
is a finite set $\Sigma$ of expressions of the form $\sigma \le \tau$, where $\sigma, \tau \in T_C[X]$. $\Sigma$ is said
to be satisfiable if there is a substitution $\delta : X \to T_C$, such that for every
$(\sigma \le \tau) \in \Sigma$,



$T_C \models \sigma[\delta] \le \tau[\delta]$.




Type Reconstruction for Functional Programs with Subtyping over a Lattice 445 



The above formula means that the inequality $\sigma[\delta] \le \tau[\delta]$ holds in $T_C$, where $\sigma[\delta]$
($\tau[\delta]$) denotes the result of performing the substitution $\delta$ on $\sigma$ ($\tau$, respectively).
We call every $\delta$ satisfying the above formula a solution of $\Sigma$. To denote the fact
that $\delta$ is a solution of $\Sigma$ we will use the notation $\delta \models \Sigma$.



2.1 Shapes 

Let $\mathcal{T}_*[X]$ be the set of terms over one constant symbol $*$ and one
binary operation symbol $\to$ with variables from $X$. The elements of
$\mathcal{T}_*[X]$ we call shapes. $\mathcal{T}_*$ stands for
$\mathcal{T}_*[\emptyset]$. We use a canonical map
$(\cdot)_* : \mathcal{T}_C[X] \to \mathcal{T}_*[X]$ which from a given term $\sigma$
produces a shape $(\sigma)_*$, which is obtained from $\sigma$ by replacing every
constant from $C$ by $*$. We call $(\sigma)_*$ the shape of the term $\sigma$.

For $a \in \mathcal{T}_*$ let
$\mathcal{T}_{C,a} = \{\sigma \in \mathcal{T}_C \mid (\sigma)_* = a\}$ be the set of
all types of shape $a$.

Clearly, for every shape $a$, $\mathcal{T}_{C,a}$ is a lattice.

A system $\Sigma = \{(\tau_1 \le \rho_1), \ldots, (\tau_n \le \rho_n)\}$ is said to
be weakly satisfiable if
$\Sigma_* = \{((\tau_1)_*, (\rho_1)_*), \ldots, ((\tau_n)_*, (\rho_n)_*)\}$ is
unifiable in $\mathcal{T}_*$. Weak satisfiability is clearly a necessary condition
for satisfiability. It is decidable in polynomial time since it is an instance of
the unification problem.
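As an illustration of weak satisfiability, the following Python sketch computes
shapes and runs a textbook unification with occurs check. The encoding (a string
for a constant of $C$, `"*"` for the shape constant, `("var", name)`,
`("->", left, right)`) and every function name are assumptions of this sketch.
It works on tree terms, so it does not by itself achieve the polynomial bound,
which relies on a shared (dag) representation.

```python
def shape(term):
    """Replace every constant of C by '*'; keep variables and arrows."""
    if isinstance(term, str):
        return "*"
    if term[0] == "var":
        return term
    return ("->", shape(term[1]), shape(term[2]))


def resolve(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while isinstance(t, tuple) and t[0] == "var" and t[1] in subst:
        t = subst[t[1]]
    return t


def occurs(name, t, subst):
    """Occurs check: does the variable `name` occur in t under subst?"""
    t = resolve(t, subst)
    if t == "*":
        return False
    if t[0] == "var":
        return t[1] == name
    return occurs(name, t[1], subst) or occurs(name, t[2], subst)


def unify(pairs):
    """Return a (triangular) unifier of all pairs of shapes, or None."""
    subst = {}
    stack = list(pairs)
    while stack:
        a, b = stack.pop()
        a, b = resolve(a, subst), resolve(b, subst)
        if a == b:
            continue
        if isinstance(a, tuple) and a[0] == "var":
            if occurs(a[1], b, subst):
                return None
            subst[a[1]] = b
        elif isinstance(b, tuple) and b[0] == "var":
            stack.append((b, a))
        elif a == "*" or b == "*":
            return None  # '*' cannot be unified with an arrow
        else:
            stack.append((a[1], b[1]))
            stack.append((a[2], b[2]))
    return subst


def weakly_satisfiable(system):
    """A system is weakly satisfiable iff its shape equations are unifiable."""
    return unify([(shape(s), shape(t)) for s, t in system]) is not None


x, y = ("var", "x"), ("var", "y")
print(weakly_satisfiable([(("->", x, "int"), ("->", "int", y))]))  # True
print(weakly_satisfiable([(x, ("->", x, "int"))]))                 # False
```

The second system fails the occurs check: no finite shape for `x` can equal the
shape of `x -> int`.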



2.2 Two Formal Systems 

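We start with the following weak deduction system for deriving formulas of the
form $\sigma \le \tau$.

(Ax) $\Sigma \vdash \sigma \le \tau$, for every $(\sigma \le \tau) \in \Sigma$.

(Arrow)
$$\frac{\Sigma \vdash \sigma_1 \to \sigma_2 \le \tau_1 \to \tau_2}{\Sigma \vdash \tau_1 \le \sigma_1, \quad \Sigma \vdash \sigma_2 \le \tau_2}$$

(Trans)
$$\frac{\Sigma \vdash \sigma \le \tau, \quad \Sigma \vdash \tau \le \rho}{\Sigma \vdash \sigma \le \rho}$$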



As shown in [Tiu97] checking derivability in the above system can be done
in PTIME. $\Sigma$ is said to be ground consistent (see [Tiu92]) if for all
$c_1, c_2 \in C$, if $\Sigma \vdash c_1 \le c_2$, then $c_1 \le c_2$ holds in
$(C, \le)$. It was shown in [Tiu92] that a system $\Sigma$ of subtype inequalities
is shape consistent and ground consistent iff it is satisfiable. Hence,
satisfiability is decidable in PTIME.
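Ground consistency can be checked by a naive fixed-point computation, sketched
below in Python. The sketch assumes that (Arrow) is read as decomposition of an
inequality between two arrow terms (the reading under which (MS) is its
reversal); the term encoding and all names are hypothetical, and the loop makes
no attempt to match the efficient procedure of [Tiu97]. Since every derived term
is a subterm of a term of the system, the closure is finite.

```python
def is_arrow(t):
    return isinstance(t, tuple) and t[0] == "->"


def closure(system):
    """All inequalities derivable by (Ax), (Arrow) as decomposition, (Trans)."""
    derived = set(system)  # (Ax)
    while True:
        new = set()
        for s, t in derived:
            if is_arrow(s) and is_arrow(t):  # (Arrow)
                new.add((t[1], s[1]))        # argument: contravariant
                new.add((s[2], t[2]))        # result: covariant
            for t2, r in derived:            # (Trans)
                if t2 == t:
                    new.add((s, r))
        if new <= derived:
            return derived
        derived |= new


def ground_consistent(system, atom_leq):
    """Every derivable inequality between constants must hold in (C, <=)."""
    return all(
        atom_leq(a, b)
        for a, b in closure(system)
        if isinstance(a, str) and isinstance(b, str)
    )


# Assumed example lattice: the chain int <= real.
CHAIN = {("int", "int"), ("int", "real"), ("real", "real")}


def chain_leq(a, b):
    return (a, b) in CHAIN


bad = [(("->", "int", "real"), ("->", "real", "int"))]
print(ground_consistent(bad, chain_leq))  # False: real <= int is derivable
```

For the union of the two systems of Example 2 in Section 3, the same closure
derives both $y \le b$ and $b \to a \le x$.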

$\Sigma$ is said to have redundant variables if there exist two variables
$x, y \in var(\Sigma)$, such that $\Sigma \vdash x \le y$ and
$\Sigma \vdash y \le x$. We can always easily eliminate redundant variables from
any system (by substitution). We will see that they are harmful when constructing
generic solutions for interval systems.

For technical reasons we need yet another system. We will call it a substitution
system, and denote derivability in that system by $\vdash_{ms}$. Its main rule
(MS) is a reversal of (Arrow) in the system $\vdash$ defined above.




446 Jerzy Tiuryn 



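(Refl) $\Sigma \vdash_{ms} \sigma \le \sigma$.

(Const) $\Sigma \vdash_{ms} c_1 \le c_2$, for all $c_1, c_2 \in C$, such that
$C \models c_1 \le c_2$.

(Ax) $\Sigma \vdash_{ms} \sigma \le \tau$, for every $(\sigma \le \tau) \in \Sigma$.

(MS)
$$\frac{\Sigma \vdash_{ms} \tau_1 \le \sigma_1, \quad \Sigma \vdash_{ms} \sigma_2 \le \tau_2}{\Sigma \vdash_{ms} \sigma_1 \to \sigma_2 \le \tau_1 \to \tau_2}$$

(Trans)
$$\frac{\Sigma \vdash_{ms} \sigma \le \tau, \quad \Sigma \vdash_{ms} \tau \le \rho}{\Sigma \vdash_{ms} \sigma \le \rho}$$

$\vdash_{ms}$ will be used in Section 3. It will be the main tool in establishing
equivalence properties for interval systems.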



2.3 Solutions and Admissible Shapes 

Let $\Sigma$ be a system of subtype inequalities and assume that it is subtype
satisfiable. What are the possible shapes of solutions? In order to answer this
question we have to consider an instance of the unification problem $\Sigma_*$.
Let $s_* : var(\Sigma) \to \mathcal{T}_*[X]$ be the most general unifier of
$\Sigma_*$. Without loss of generality we may assume that each term assigned by
$s_*$ has all its variables contained in $var(\Sigma)$. Call these variables
independent and let $Ivar(\Sigma)$ denote the set of all independent variables. It
should be noticed that the notion of an independent variable depends on the choice
of the most general unifier of $\Sigma_*$. For example, if
$\Sigma = \{x \le y, x \le z\}$, then we have three most general solutions which
satisfy the above condition about the variables:

$s_{1*}(x) = x$, $s_{1*}(y) = x$, $s_{1*}(z) = x$.

$s_{2*}(x) = y$, $s_{2*}(y) = y$, $s_{2*}(z) = y$.

$s_{3*}(x) = z$, $s_{3*}(y) = z$, $s_{3*}(z) = z$.

Choosing $s_{1*}$ as the most general solution we get $Ivar(\Sigma) = \{x\}$, while
choosing $s_{3*}$ we get $Ivar(\Sigma) = \{z\}$.

The intuition behind the notion of independent variables is that we can freely
choose arbitrary shapes for them; the rest of the variables get shapes uniquely
determined by $s_*$.

Now, returning to the question of possible shapes of solutions of $\Sigma$, if
$\delta : var(\Sigma) \to \mathcal{T}_C$ is a solution, then, passing to shapes we
get that $(\delta)_*$ is a unifier of $\Sigma_*$, and therefore it is an instance
of the most general unifier $s_*$. This means that there is a substitution
$s' : Ivar(\Sigma) \to \mathcal{T}_*$ such that $(\delta)_* = s' \circ s_*$. For
that reason we call every shape assignment which is an instance of the most
general unifier an admissible shape assignment. Since every admissible shape
assignment is uniquely determined by a substitution
$s : Ivar(\Sigma) \to \mathcal{T}_*$, we will often refer to $s$ when we want to
fix a possible shape of subtype solutions of $\Sigma$. Hence the following
definition. Let $s : Ivar(\Sigma) \to \mathcal{T}_*$ be any shape substitution. The
space of solutions of $\Sigma$ in the shape assignment $s$ is defined as follows

$$SOL(\Sigma, s) = \{\delta : var(\Sigma) \to \mathcal{T}_C \mid \delta \models \Sigma \text{ and } (\delta)_* = s \circ s_*\}.$$
In particular, there is a special shape substitution (the smallest one)
$s_g : Ivar(\Sigma) \to \mathcal{T}_*$, where $s_g(x) = *$, for all
$x \in Ivar(\Sigma)$. This substitution we will call the ground shape substitution
and all solutions of $\Sigma$ which match the shape assignment $s_g \circ s_*$
will be called ground solutions.



2.4 Alternating DAGs 

In this section we recall a moderately extended notion of an alternating directed
acyclic graph (adag in short) which was introduced in [Tiu97]. The extension
which we are going to describe here consists in allowing variables, as well as
top and bottom (in various shapes). Let $X$ be a set of variables. We assume
that every variable $x \in X$ comes with a shape $a_x \in \mathcal{T}_*[Y]$
assigned to it.^1 We also assume that we have a supply of symbols $\bot_a$ and
$\top_a$, indexed with shapes $a$ ranging over $\mathcal{T}_*[Y]$. Symbols
$\bot_a$ and $\top_a$ will be called bottom and top, respectively. We will define
the set $\mathcal{A}_{C,Y}[X]$ of adags with variables in $X$ and shapes having
their variables in $Y$.

Nodes of adags are labelled by the following symbols: 

(•) $\to$ labels nodes of out-degree 2 and the two edges leaving such nodes are
labelled: one with 0, and another with 1.

(•) The elements of $C$, variables of $X$, bottoms and tops label nodes of
out-degree 0.

(•) The symbols $\sqcap$ and $\sqcup$ label nodes of positive out-degree.
Different nodes labelled $\sqcup$ or $\sqcap$ may have different out-degrees.

Alternating directed acyclic graphs are defined by induction on the number
of nodes. Simultaneously with the definition of an adag we define a shape of
every node in it. The definition follows.

(I.) One element adag. We have the following possibilities for labelling of its
node:

- It is labelled by $c \in C$; the shape of this node is $*$.

- It is labelled by $x \in X$; the shape of this node is $a_x$.

- It is labelled by $\bot_a$ or $\top_a$; the shape of this node is $a$.

(II.) Suppose $d$ is an adag and we have selected two nodes $v_1$ and $v_2$ in
$d$, whose shapes are $a_1$ and $a_2$, respectively. We allow the situation when
$v_1 = v_2$, i.e. when we have selected only one node. Then the graph obtained
from $d$ by adding one new node $v$, which is labelled $\to$, and two edges: one
from $v$ to $v_1$ (labelled 0), and one from $v$ to $v_2$ (labelled 1) is an adag.
The shape of $v$ is $a_1 \to a_2$ and the shapes of other nodes are inherited
from $d$.

(III.) Suppose $d$ is an adag and we select $n$ nodes of $d$ ($n > 0$)
$v_1, \ldots, v_n$, all of the same shape $a$. Then the graph obtained from $d$ by
adding a new node $v$, which is labelled $\sqcup$ or $\sqcap$, and adding $n$ new
edges: for each $1 \le i \le n$ one edge from $v$ to $v_i$, is an adag. The shape
of $v$ is $a$ and shapes of the other nodes are inherited from $d$.

(IV.) Disjoint union of two adags is an adag. The shapes of nodes are inherited.

^1 The set $Y$ is in general independent of $X$.
In fact we will be interested only in adags which are rooted. We say that
a node $v$ in an adag $d$ is a root if every other node of $d$ is reachable from
$v$ along a directed path, i.e. a path which follows edges, respecting their
direction. A root of an adag is unique (if it exists). By $\mathcal{A}_{C,Y}[X]$
we denote the set of all rooted adags with variables in $X$ and shapes in
$\mathcal{T}_*[Y]$. The set $\mathcal{A}_{C,Y}[\emptyset]$ is simply denoted
$\mathcal{A}_{C,Y}$. In a similar way, $\mathcal{A}_C$ stands for
$\mathcal{A}_{C,\emptyset}$. From now on, by an adag we will always mean a rooted
adag. A shape of an adag is the shape of its root.

An adag assignment is a function $\delta : X \to \mathcal{A}_{C,Y}[X]$ which
preserves shapes, i.e., for each $x \in X$, the adag $\delta(x)$ is of shape $a_x$.

Given an adag $d \in \mathcal{A}_{C,Y}[X]$, we naturally define an adag
$d[\delta] \in \mathcal{A}_{C,Y}[X]$ which results from $d$ by simultaneously
substituting in $d$ for each node labelled $x$ the adag $\delta(x)$. This
operation of substitution gives rise to a function
$\delta : \mathcal{A}_{C,Y}[X] \to \mathcal{A}_{C,Y}[X]$. We do not introduce a
special notation for this function since it is an extension of $\delta$. Clearly
the shapes of $d$ and $d[\delta]$ are equal.

Let $d \in \mathcal{A}_{C,Y}$ be an adag without variables. It may contain some
occurrences of the bottom and top symbols with shapes ranging over
$\mathcal{T}_*[Y]$, which actually may contain some variables. Given any shape
substitution $s : Y \to \mathcal{T}_*$, the adag $d[s] \in \mathcal{A}_C$ is
obtained from $d$ by replacing each bottom (top) symbol $\bot_a$ ($\top_a$)
occurring in $d$ by $\bot_{a[s]}$ ($\top_{a[s]}$).

Finally we have a natural function of unfoldment
$U : \mathcal{A}_C \to \mathcal{T}_C$ which unfolds an adag $d \in \mathcal{A}_C$
into a type $U(d) \in \mathcal{T}_C$, under the proviso that each bottom (top)
symbol $\bot_a$ ($\top_a$) occurring in $d$ (the shape $a$ must, in that case,
belong to $\mathcal{T}_*$) is replaced by the least (greatest) element in the
lattice $\mathcal{T}_{C,a}$. Clearly the shape of $U(d)$ is the same as the shape
of $d$.

Obviously we can use adag notation for solutions of a system $\Sigma$. We say that
$\delta : var(\Sigma) \to \mathcal{A}_C$ is a solution of $\Sigma$ if
$U \circ \delta$ is a solution in the original sense.
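The unfoldment $U$ can be sketched in Python as follows. Everything concrete
here is an assumption of the sketch: the lattice $C$ is taken to be the chain
`int <= real`, an adag is a dictionary from node names to labels, and the label
formats `("const", c)`, `("bot", shape)`, `("top", shape)`, `("->", n0, n1)`,
`("join", [nodes])`, `("meet", [nodes])` are invented for illustration. Shapes
are `"*"` or `("->", a, b)`.

```python
from functools import reduce

CHAIN = ["int", "real"]  # assumed lattice C: int <= real


def atom(op, a, b):
    """Join (op=max) or meet (op=min) of two elements of the chain."""
    return CHAIN[op(CHAIN.index(a), CHAIN.index(b))]


def extreme(shape, least):
    """Least (or greatest) type of a variable-free shape."""
    if shape == "*":
        return CHAIN[0] if least else CHAIN[-1]
    # the argument position is contravariant, so the polarity flips there
    return ("->", extreme(shape[1], not least), extreme(shape[2], least))


def combine(s, t, join):
    """Join (or meet) of two types of the same shape."""
    if isinstance(s, str):
        return atom(max if join else min, s, t)
    return ("->", combine(s[1], t[1], not join), combine(s[2], t[2], join))


def unfold(adag, node):
    """Unfold the sub-adag rooted at `node` into a type."""
    label = adag[node]
    kind = label[0]
    if kind == "const":
        return label[1]
    if kind in ("bot", "top"):
        return extreme(label[1], kind == "bot")
    if kind == "->":
        return ("->", unfold(adag, label[1]), unfold(adag, label[2]))
    parts = [unfold(adag, child) for child in label[1]]  # "join" or "meet"
    return reduce(lambda s, t: combine(s, t, kind == "join"), parts)


ADAG = {
    "i": ("const", "int"),
    "r": ("const", "real"),
    "f": ("->", "i", "i"),  # int -> int, sharing the node "i"
    "g": ("->", "r", "r"),  # real -> real
    "root": ("join", ["f", "g"]),
}
print(unfold(ADAG, "root"))  # ('->', 'int', 'real')
```

The sketch revisits shared nodes, so the resulting type may be much larger than
the adag; this is why sizes are measured on adags rather than on their
unfoldings.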

3 Interval Systems 

An interval system $I$ is a finite function whose domain is a finite set $X$ of
variables; it assigns to each $x \in X$ a pair $I(x) = (I_L(x), I_R(x))$, where
$I_L(x)$ and $I_R(x)$ are finite subsets of $\mathcal{T}_C[X] - \{x\}$.

We think of $I$ as being a finite system of inequalities:
$\{\sigma \le x \mid x \in X, \sigma \in I_L(x)\} \cup \{x \le \sigma \mid x \in X, \sigma \in I_R(x)\}$.
In particular the expression $I \vdash \sigma \le \tau$ means that the inequality
$\sigma \le \tau$ is derivable from the above system. Also, a substitution
$\delta : X \to \mathcal{T}_C$ is said to be a solution of $I$, denoted
$\delta \models I$, if $\delta$ satisfies all inequalities of the above system.

Interval systems will be of interest to us since there is a canonical way of
associating an interval system $I_\Sigma$ with an arbitrary system $\Sigma$ of
subtype inequalities. Its domain is $X = var(\Sigma)$ and the definition of the
function follows.

$$I_{\Sigma,L}(x) = \{\sigma \in \mathcal{T}_C[X] - \{x\} \mid \Sigma \vdash \sigma \le x\},$$

$$I_{\Sigma,R}(x) = \{\sigma \in \mathcal{T}_C[X] - \{x\} \mid \Sigma \vdash x \le \sigma\}.$$

It follows that if $\Sigma$ has no redundant variables, then $I_\Sigma$ has no
redundant variables either.
It is easy to show that if $\Sigma$ is ground consistent, then $\Sigma$ and
$I_\Sigma$ have the same solutions. The main point of introducing the notion of
an interval system is that it behaves very well (see Proposition 1) when the
original system is expanded by adding some more inequalities (this is what
happens when we build a set of constraints which express typability of a lambda
term). Another important property is that it is very easy to get some solutions
(we will call them generic solutions) of an interval system, should it be
satisfiable (see Theorem 1).

We say that an interval system $I$ is saturated if for every $x \in X$ and for
every $\sigma \in \mathcal{T}_C[X] - \{x\}$ the following two properties hold:

(a.) If $I \vdash x \le \sigma$, then $\sigma \in I_R(x)$;

(b.) If $I \vdash \sigma \le x$, then $\sigma \in I_L(x)$.

Clearly, for every $\Sigma$, the interval system $I_\Sigma$ is saturated.

For an interval system $I$ let $\bar{I}$ denote the least saturated interval
system which contains $I$. It is defined as follows. It has the same domain $X$
as $I$ and for $x \in X$ we have

$$\bar{I}_L(x) = \{\sigma \in \mathcal{T}_C[X] - \{x\} \mid I \vdash \sigma \le x\},$$

$$\bar{I}_R(x) = \{\sigma \in \mathcal{T}_C[X] - \{x\} \mid I \vdash x \le \sigma\}.$$

It is easy to check that $\bar{I}$ is the least saturated interval system
containing $I$.

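Our main concern in this section is to determine how the interval system
$I_{\Sigma \cup \Gamma}$ depends on the interval systems $I_\Sigma$ and
$I_\Gamma$. It turns out that $I_\Sigma \cup I_\Gamma$, in general, is not equal
to $I_{\Sigma \cup \Gamma}$. It even fails to be saturated, as the following
example shows.

Example 1. Let $\Sigma = \{y \to a \le x\}$, and let
$\Gamma = \{x \le b \to y\}$. Then

$I_\Sigma(x) = (\{y \to a\}, \emptyset)$,

$I_\Sigma(y) = (\emptyset, \emptyset)$,

$I_\Gamma(x) = (\emptyset, \{b \to y\})$,

$I_\Gamma(y) = (\emptyset, \emptyset)$.

Clearly we have:

$I_{\Sigma \cup \Gamma}(x) = (\{y \to a\}, \{b \to y\})$

$I_{\Sigma \cup \Gamma}(y) = (\{a\}, \{b\})$,

while $I_\Sigma \cup I_\Gamma$ is not saturated.

In the above example we have
$\overline{I_\Sigma \cup I_\Gamma} = I_{\Sigma \cup \Gamma}$. In general, however,
even this is not necessarily the case.

Example 2. Let $\Sigma = \{y \to a \le x\}$ and
$\Gamma = \{b \to a \le y \to a\}$. Then
$I_{\Sigma \cup \Gamma}(x) = (\{b \to a, y \to a\}, \emptyset)$, and
$I_{\Sigma \cup \Gamma}(y) = (\emptyset, \{b\})$.
However, $\overline{I_\Sigma \cup I_\Gamma} = I_\Sigma \cup I_\Gamma \ne I_{\Sigma \cup \Gamma}$.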

The systems $I_\Sigma \cup I_\Gamma$ and $I_{\Sigma \cup \Gamma}$ of Example 2,
even though they are not equal, are equivalent in a certain intuitive sense. Next
we are going to present a suitable notion of equivalence.

Two interval systems $I$, $J$ with the same domain $X$ are said to be equivalent
(we denote it by $I \sim J$), if for every $x \in X$:

(a) $\forall \sigma \in I_L(x)\ \exists \sigma' \in J_L(x),\ J \vdash_{ms} \sigma \le \sigma'$;

(a') $\forall \sigma \in J_L(x)\ \exists \sigma' \in I_L(x),\ I \vdash_{ms} \sigma \le \sigma'$;

(b) $\forall \tau \in I_R(x)\ \exists \tau' \in J_R(x),\ J \vdash_{ms} \tau' \le \tau$;

(b') $\forall \tau \in J_R(x)\ \exists \tau' \in I_R(x),\ I \vdash_{ms} \tau' \le \tau$.

This notion of equivalence will prove useful in studying the relationship between
interval systems and the original systems of subtype inequalities. Some of
the basic properties of $\sim$ are summarized below.

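Proposition 1. Let $I$ and $J$ be interval systems.

(i) If $I \sim J$, then for all $\sigma, \tau \in \mathcal{T}_C[X]$, if
$I \vdash_{ms} \sigma \le \tau$ then $J \vdash_{ms} \sigma \le \tau$.

(ii) If $I \sim J$, then $I$ and $J$ have the same solutions.

(iii) $\sim$ is an equivalence relation on interval systems.

(iv) Let $\Sigma$ and $\Gamma$ be arbitrary systems of subtype inequalities such
that $\Sigma \cup \Gamma$ is satisfiable and contains no redundant variables. Let
$I$, $J$ be interval systems such that $I \subseteq I_\Sigma$,
$J \subseteq I_\Gamma$, $I \sim I_\Sigma$ and $J \sim I_\Gamma$. Then
$I \cup J \subseteq I_{\Sigma \cup \Gamma}$ and moreover
$I \cup J \sim I_{\Sigma \cup \Gamma}$. In particular
$I_{\Sigma \cup \Gamma} \sim I_\Sigma \cup I_\Gamma$.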

4 Generic Solutions 

Let $I$ be an interval system whose domain is a finite set of variables $X$.
Assume that $I$ is shape consistent. Let $s_* : X \to \mathcal{T}_*[X]$ be the
most general unifier of the system
$I_* = \{(\sigma)_* = x \mid x \in X, \sigma \in I_L(x) \cup I_R(x)\}$. Let
$Y = Ivar(I)$ be the set of independent variables. We define two functions
$\delta_{I,L}, \delta_{I,R} : X \to \mathcal{A}_{C,Y}[X]$ as follows. For
$x \in X$, if $I_L(x) = \emptyset$, then $\delta_{I,L}(x) = \bot_{s_*(x)}$. To have
a succinct notation we use dags to represent shapes $s_*(x)$, for $x \in X$.
Otherwise we set $\delta_{I,L}(x)$ to be an adag obtained as follows. Let
$I_L(x) = \{\sigma_1, \ldots, \sigma_n\}$, with $n \ge 1$. First we make a dag
from $\sigma_1, \ldots, \sigma_n$ by sharing nodes which correspond to equal
subterms (i.e. nodes of this dag are in one-to-one correspondence with subterms
of $\sigma_1, \ldots, \sigma_n$). It has $n$ naturally distinguished nodes
$v_1, \ldots, v_n$ which correspond to the terms $\sigma_1, \ldots, \sigma_n$. The
adag $\delta_{I,L}(x)$ is obtained from the above described dag by adjoining the
root $v$ labelled $\sqcup$, plus adding $n$ edges $(v, v_i)$, for
$i = 1, \ldots, n$.

The adag $\delta_{I,R}(x)$ is defined dually to $\delta_{I,L}(x)$ by replacing
$\bot$ by $\top$ and $\sqcup$ by $\sqcap$.

According to the remarks of Section 2.4 both functions can be extended to
$\delta_{I,L}, \delta_{I,R} : \mathcal{A}_{C,Y}[X] \to \mathcal{A}_{C,Y}[X]$.

Proposition 2. Let $X = var(I)$, $Y = Ivar(I)$ and assume that $I$ is shape
consistent and contains no redundant variables. Then there exists $n \le |X|$ such
that

(i) For every $x \in X$, $\delta^n_{I,L}(x), \delta^n_{I,R}(x) \in \mathcal{A}_{C,Y}$,
i.e. after $n$ iterations $\delta^n_{I,L}(x)$ and $\delta^n_{I,R}(x)$ contain no
variables.

(ii) For every $m > n$ we have $\delta^m_{I,L} = \delta^n_{I,L}$ and
$\delta^m_{I,R} = \delta^n_{I,R}$.

(iii) The size of $\delta^n_{I,L}$ and $\delta^n_{I,R}$ is linear in the size of $I$.
Proof. For the proof of (i) and (ii) let us introduce the following relation
$\preceq$ in the set $X$. This is the least transitive and reflexive relation
satisfying the following condition: for all $x_1, x_2 \in X$, for every
$\sigma \in I_L(x_2)$, if $x_1$ occurs in $\sigma$, then $x_1 \preceq x_2$. Since
$I$ is shape consistent and contains no redundant variables, it follows that
$\preceq$ is a partial order.

It follows from the definition of $\delta_{I,L}$ that for all $x_1, x_2 \in X$, if
$x_2$ occurs in $\delta_{I,L}(x_1)$, then $x_2 \prec x_1$, i.e. $x_2$ is strictly
smaller than $x_1$. Hence, for every $k$ and all $x_1, x_2 \in X$, if $x_2$ occurs
in $\delta^k_{I,L}(x_1)$, then there exist $z_0, \ldots, z_k \in X$ such that

$$x_2 = z_k \prec z_{k-1} \prec \cdots \prec z_1 \prec z_0 = x_1.$$

Thus there exists $n \le |X|$ such that for every $x \in X$,
$\delta^n_{I,L}(x)$ contains no variables. The argument for $\delta_{I,R}$ is
similar. Clearly (i) and (ii) follow from this observation.

(iii) follows from the observation that computing successive iterations of
$\delta_{I,L}$ amounts to shifting pointers, i.e. the size does not increase. In
the beginning the function $\delta_{I,L}$ can be viewed as an adag (not rooted)
together with a pointer, for every variable $x \in X$, indicating the node which
starts the adag $\delta_{I,L}(x)$. Of course, as usual, common subterms are
shared. Then computing the 'fixed-point' iteration $\delta^n_{I,L}$ consists in
the following step: for every variable $x$ which occurs as a label of a leaf^2
$v$ in the adag representation of $\delta_{I,L}$ we remove that node and change
every edge of the form $(u, v)$ to the edge $(u, v_x)$, where $v_x$ is the node
which represents the root of the adag $\delta_{I,L}(x)$, i.e. we remove $v$ and
redirect every edge pointing to $v$ as pointing to the root of
$\delta_{I,L}(x)$. The same remarks apply to $\delta_{I,R}$. This completes the
proof of Proposition 2.

We denote the $n$-fold composition $\delta^n_{I,L}$, where $n$ is as in
Proposition 2, by $\delta^*_{I,L}$. The same applies to $\delta^*_{I,R}$. Let us
recall (see Section 2.4) that for every shape substitution
$s : Y \to \mathcal{T}_*$ and an adag $d \in \mathcal{A}_{C,Y}$, the adag
$d[s] \in \mathcal{A}_C$ is obtained from $d$ by applying $s$ to all variables in
shapes of $\bot$ and $\top$. This gives rise to a function
$\delta^*_{I,L}[s] : X \to \mathcal{A}_C$, which is defined by
$(\delta^*_{I,L}[s])(x) = (\delta^*_{I,L}(x))[s]$. The same applies to the
function $\delta^*_{I,R}[s] : X \to \mathcal{A}_C$.

It should be clear that the size of $\delta^*_{I,L}[s]$ is linear in the size of
$\delta^*_{I,L}$ and in the size of $s$. In particular, if $s$ assigns dags
(which represent shapes) to the variables, then the size of $\delta^*_{I,L}[s]$
can be kept quite succinct.

Let us observe that for every $x \in X$, when passing to the unfoldment of
$(\delta^*_{I,L}(x))[s]$ we get the least upper bound of terms
$\sigma \in I_L(x)$ in which every variable $y$ occurring in $\sigma$ is replaced
by the unfoldment of $(\delta^*_{I,L}(y))[s]$. Hence we have the following
formula.

$$U((\delta^*_{I,L}(x))[s]) = \bigsqcup_{\sigma \in I_L(x)} \sigma[U \circ \delta^*_{I,L}[s]] \qquad (1)$$

The above least upper bound is taken in $\mathcal{T}_{C,a}$, where the shape
$a = s(s_*(x))$.
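Formula (1) can be traced on a small case with the following Python sketch. It
is restricted to the ground shape assignment, so every variable has shape $*$
and every lower bound is either a constant or another variable. The four-element
lattice (`bot` below the incomparable `a` and `b`, both below `top`), the
dictionary encoding of $I_L$, and all names are assumptions of the sketch; upper
bounds are not examined here.

```python
from functools import reduce

# Assumed lattice C: bot <= a, b <= top, with a and b incomparable.
ELEMS = ["bot", "a", "b", "top"]
LEQ = (
    {(e, e) for e in ELEMS}
    | {("bot", e) for e in ELEMS}
    | {(e, "top") for e in ELEMS}
)


def join(p, q):
    """Least upper bound of p and q, found by search over the order."""
    uppers = [u for u in ELEMS if (p, u) in LEQ and (q, u) in LEQ]
    return next(u for u in uppers if all((u, v) in LEQ for v in uppers))


def lower_generic(lower_bounds):
    """Evaluate formula (1) for every variable.

    lower_bounds maps each variable to the list I_L(x) of its lower bounds;
    a bound is a variable if it is a key of the map, otherwise a constant.
    Termination relies on the dependency order being well founded.
    """
    memo = {}

    def value(x):
        if x not in memo:
            parts = [value(s) if s in lower_bounds else s for s in lower_bounds[x]]
            # empty I_L(x) yields the least element, matching bottom of shape *
            memo[x] = reduce(join, parts, "bot")
        return memo[x]

    return {x: value(x) for x in lower_bounds}


I_L = {"x": ["a"], "y": ["x", "b"], "z": []}
print(lower_generic(I_L))  # {'x': 'a', 'y': 'top', 'z': 'bot'}
```

Memoising `value` plays the role of node sharing in the adag representation:
each variable is evaluated once, however often it occurs in the bounds of other
variables.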



^2 Such nodes correspond to free occurrences of variables in the values
$\delta_{I,L}(y)$, with $y$ ranging over $X$.
In a similar way we obtain the formula for $(\delta^*_{I,R}(x))[s]$,

$$U((\delta^*_{I,R}(x))[s]) = \bigsqcap_{\sigma \in I_R(x)} \sigma[U \circ \delta^*_{I,R}[s]].$$

Theorem 1. Let $I$ be an interval system which is satisfiable, saturated and
contains no redundant variables. Let $Y = Ivar(I)$. Then for every admissible
shape assignment $s : Y \to \mathcal{T}_*$ the functions
$U \circ \delta^*_{I,L}[s]$ and $U \circ \delta^*_{I,R}[s]$ belong to
$SOL(I, s)$. The size of $\delta^*_{I,L}[s]$ and $\delta^*_{I,R}[s]$ is linear in
the size of $I$ and the size of $s$.

Proof. We give the proof for $\delta = U \circ \delta^*_{I,L}[s]$. The proof for
the other function is completely similar by dualization. Let $x \in X$. We will
show the following two properties.

$$\mathcal{T}_C \models \delta(\sigma) \le \delta(x), \text{ for } \sigma \in I_L(x). \qquad (2)$$

$$\mathcal{T}_C \models \delta(x) \le \delta(\tau), \text{ for } \tau \in I_R(x). \qquad (3)$$

The property (2) follows immediately from (1). Indeed:
$\delta(x) = U((\delta^*_{I,L}(x))[s]) = \bigsqcup_{\rho \in I_L(x)} \rho[U \circ \delta^*_{I,L}[s]] = \bigsqcup_{\rho \in I_L(x)} \delta(\rho) \ge \delta(\sigma)$.

For the proof of (3) we introduce the following partial order $\preceq$ on $X$.
This is the least transitive and reflexive relation which satisfies the following
condition: for every $x \in X$, for every $\rho \in I_L(x)$ and for every variable
$y \in X$ which occurs in $\rho$, $y \preceq x$ holds.

$\preceq$ is indeed a partial order since $I$ is satisfiable and contains no
redundant variables. Now, we prove (3) by induction on $x$ with respect to
$\preceq$. Assume (3) holds for all variables strictly less than $x$ and take any
$\tau \in I_R(x)$. It follows from (1) that it is enough to prove

$$\mathcal{T}_C \models \delta(\sigma) \le \delta(\tau), \text{ for all } \sigma \in I_L(x). \qquad (4)$$

Take any $\sigma \in I_L(x)$ and consider any pair $\sigma', \tau'$ such that
$|\sigma'| = 1$ or $|\tau'| = 1$ and
$\sigma \le \tau \vdash \sigma' \le \tau'$. Since $I$ is satisfiable it follows
that if both $\sigma'$ and $\tau'$ contain no variable, then
$\mathcal{T}_C \models \sigma' \le \tau'$ holds. So assume that $\sigma'$ or
$\tau'$ contains a variable. Again, by satisfiability of $I$ it follows that we
have to consider only the following two cases:

(A) $\sigma'$ is a variable, say $\sigma' = y$.

(B) $\tau'$ is a variable, say $\tau' = y$.

In case (A), since $I$ is saturated we conclude that $\tau' \in I_R(y)$. Since
$y \prec x$, it follows by induction hypothesis that
$\mathcal{T}_C \models \delta(y) \le \delta(\tau')$. In case (B) it follows from
(1) that $\mathcal{T}_C \models \delta(\sigma') \le \delta(y)$. Since the terms
$\sigma', \tau'$ are arbitrary, (4) follows. This completes the proof of (3).
Therefore we have shown that $\delta$ is a solution of $I$.

Linearity of the size of $\delta^*_{I,L}[s]$ follows from earlier remarks. This
completes the proof of Theorem 1.

The next result is an immediate corollary of Theorem 1. 
Corollary 1. Let $\Sigma$ be a satisfiable system of subtype inequalities.

(i) If $\Sigma$ is satisfiable in one shape assignment, then it is satisfiable in
all admissible shape assignments.

(ii) If $\Sigma$ contains no redundant variables and $Y = Ivar(\Sigma)$, then for
every admissible shape assignment $s : Y \to \mathcal{T}_*$, the functions
$U \circ \delta^*_{I_\Sigma,L}[s]$ and $U \circ \delta^*_{I_\Sigma,R}[s]$ belong
to $SOL(\Sigma, s)$. The functions $\delta^*_{I_\Sigma,L}[s]$ and
$\delta^*_{I_\Sigma,R}[s]$ are PTIME computable from $\Sigma$ and $s$.

5 Conclusion 

The paper introduces the notion of an interval system of inequalities. Interval
systems are useful in the following three respects:

- It is easy to construct the interval system $I_\Sigma$, for any system $\Sigma$
of subtype inequalities. $I_\Sigma$ and $\Sigma$ have the same solutions.

- Interval systems behave well with respect to their set union (Proposition 1).

- It is easy to compute specific solutions (we call them generic solutions) in
every admissible shape for a satisfiable interval system (Theorem 1).

What remains to be done is a study of the behavior of interval systems under
the operation of substitution. Set union and substitution are the only operations
used in the process of generating constraints (in the bottom-up fashion) for the
problem of type reconstruction for functional programs with subtyping.